Ikiwiki currently only has one type of dependency between pages
(plus wikilinks special-cased in on the side). This has resulted in various
problems, and it's seemed for a long time to me that ikiwiki needs to get
smarter about what types of dependencies are supported.

### unnecessary work

The current single dependency type causes the depending page to be rebuilt
whenever a matching dependency is added, removed, or *modified*. But a
great many things don't care about the modification case, and often cause
unnecessary page rebuilds:

* map only cares if the pages are added or removed. Content change does
  not matter (unless show=title is used).
* brokenlinks, orphans, pagecount, ditto (generally)
* inline in archive mode cares about page title, author changing, but
  not content. (Ditto for map with show=title.)
* Causes extra work when solving the [[bugs/transitive_dependencies]]
  problem.

### two types of dependencies needed for [[tracking_bugs_with_dependencies]]

>> it seems that there are two types of dependency, and ikiwiki
>> currently only handles one of them. The first type is "Rebuild this
>> page when any of these other pages changes" - ikiwiki handles this.
>> The second type is "rebuild this page when the set of pages referred to by
>> this pagespec changes" - ikiwiki doesn't seem to handle this. I
>> suspect that named pagespecs would make that second type of dependency
>> more important. I'll try to come up with a good example. -- [[Will]]

>>> Hrm, I was going to build an example of this with backlinks, but it
>>> looks like that is handled as a special case at the moment (line 458 of
>>> render.pm). I'll see if I can break
>>> things another way. Fixing this properly would allow removal of that special case. -- [[Will]]

>>>> I can't quite understand the distinction you're trying to draw
>>>> between the two types of dependencies. Backlinks are a very special
>>>> case though and I'll be surprised if they fit well into pagespecs.
>>>> --[[Joey]]

>>>>> The issue is that the existential pagespec matching allows you to build things that have similar
>>>>> problems to backlinks.
>>>>> e.g. the following inline:

\[[!inline pages="define(~done, link(done)) and link(~done)" archive=yes]]

>>>>> includes any page that links to a page that links to done. Now imagine I add a new link to 'done' on
>>>>> some random page somewhere - a page which some other page links to which didn't previously get included - the set of pages accepted by the pagespec, and hence the set of
>>>>> pages inlined, will change. But, there is no dependency anywhere on the page that I altered, so
>>>>> ikiwiki will not rebuild the page with the inline in it. What is happening is that the page that I altered affects
>>>>> the set of pages matched by the pagespec without itself being matched by the pagespec, and hence without being included in the dependency list.

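The scenario Will describes can be reproduced in a small Python model. This is only a sketch of the semantics as stated in the comment ("any page that links to a page that links to done"), not ikiwiki's Perl pagespec code; the page names and the `matched` helper are made up for illustration.

```python
def matched(links):
    """Pages that link to some page which itself links to 'done'.

    `links` maps each page name to the set of pages it links to.
    This mimics the described meaning of the example pagespec only.
    """
    done_linkers = {page for page, targets in links.items() if "done" in targets}
    return {page for page, targets in links.items() if targets & done_linkers}


# Hypothetical wiki: index -> bug1 -> done, notes -> bug2 (bug2 not done yet).
links = {
    "bug1": {"done"},
    "index": {"bug1"},
    "bug2": set(),
    "notes": {"bug2"},
}
before = matched(links)

# Edit only bug2, adding a link to 'done'.
changed_page = "bug2"
links[changed_page] = {"done"}
after = matched(links)

# A dependency check that only asks "did a matched page change?" never fires,
# because bug2 is not in the matched set before or after the edit.
rebuild_triggered = changed_page in (before | after)
```

Here `before` contains only `index`, `after` also contains `notes`, and yet `rebuild_triggered` is false: the set of inlined pages changed without any matched page being edited.
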
>>>>> To make this work well, I think you need to recognise two types of dependencies for each page (and no
>>>>> special cases for particular types of links, eg backlinks). The first type of dependency says, "The content of
>>>>> this page depends upon the content of these other pages". The `add_depends()` in the shortcuts
>>>>> plugin is of this form: any time the shortcuts page is edited, any page with a shortcut on it
>>>>> is rebuilt. The inline plugin also needs to add dependencies of this form to detect when the inlined
>>>>> content changes. By contrast, the map plugin does not need a dependency of this form, because it
>>>>> doesn't actually care about the content of any pages, just which pages it needs to include (which we'll handle next).

>>>>> The second type of dependency says, "The content of this page depends upon the exact set of pages matched
>>>>> by this pagespec". The first type of dependency was about the content of some pages, the second type is about
>>>>> which pages get matched by a pagespec. This is the type of dependency tracking that the map plugin needs.
>>>>> If the set of pages matched by the map's pagespec changes, then the page with the map on it needs to be rebuilt to show a different list of pages.
>>>>> Inline needs this type of dependency as well as the previous type: this type handles a change in which pages
>>>>> are inlined, the previous type handles a change in the content of any of those pages. Shortcut does not need this type of
>>>>> dependency. Most of the places that use `add_depends()` seem to need this type of dependency rather than the first type.

>>>>>> Note that inline and map currently achieve the second type of dependency by
>>>>>> explicitly calling `add_depends` for each page they display.
>>>>>> If any of those pages are removed, the regular pagespec would not
>>>>>> match them -- since they're gone. However, the explicit dependency
>>>>>> on them does cause them to match. It's an ugly corner I'd like to
>>>>>> get rid of. --[[Joey]]

>>>>> Implementation Details: The first type of dependency can be handled very similarly to the current
>>>>> dependency system. You just need to keep a list of pages that the content depends upon. You could
>>>>> keep that list as a pagespec, but if you do this you might want to check that the pagespec doesn't change,
>>>>> possibly by adding a dependency of the second type along with the dependency of the first type.

>>>>>> An example of the current system not tracking enough data is
>>>>>> described in [[bugs/transitive_dependencies]].
>>>>>> --[[Joey]]

>>>>> The second type of dependency is a little more tricky. For each page, we'd need a list of pagespecs that
>>>>> the page depended on, and for each pagespec you'd want to store the list of pages that currently match it.
>>>>> On refresh, you'd need to check each pagespec to see if the set of pages that match it has changed, and if
>>>>> that set has changed, then rebuild the dependent page(s). Oh, and for this second type of dependency, I
>>>>> don't think you can merge pagespecs. If I wanted to know if either "\*" or "link(done)" changes, then just checking
>>>>> to see if the set of pages matched by "\* or link(done)" changes doesn't work.

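The bookkeeping described here, and the reason merged pagespecs lose information, can be illustrated with a short Python sketch. Pagespecs are modelled as plain predicate functions, which is an assumption made for the example; `match_set`, `record` and `changed_specs` are hypothetical helpers, not ikiwiki functions.

```python
def match_set(spec, links):
    """Set of pages currently matched by a pagespec (modelled as a predicate)."""
    return frozenset(page for page in links if spec(page, links))


def star(page, links):            # stands in for "*"
    return True


def link_done(page, links):       # stands in for "link(done)"
    return "done" in links[page]


def merged(page, links):          # stands in for "* or link(done)"
    return star(page, links) or link_done(page, links)


def record(specs, links):
    """Store, per pagespec, the set of pages that match it right now."""
    return {name: match_set(spec, links) for name, spec in specs.items()}


def changed_specs(specs, stored, links):
    """On refresh: names of pagespecs whose matched set differs from the stored one."""
    return sorted(
        name for name, spec in specs.items()
        if match_set(spec, links) != stored[name]
    )


old_links = {"a": set(), "b": set()}
new_links = {"a": set(), "b": {"done"}}   # page b gains a link to done

separate = {"*": star, "link(done)": link_done}
combined = {"* or link(done)": merged}

separate_changed = changed_specs(separate, record(separate, old_links), new_links)
combined_changed = changed_specs(combined, record(combined, old_links), new_links)
```

Tracked separately, the change to "link(done)" is detected. Tracked as one merged pagespec, the matched set is every page both before and after, so nothing is reported.
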
>>>>> The current system works because even though you usually want dependencies of the second type, the set of pages
>>>>> referred to by a pagespec can only change if one of those pages itself changes. i.e. A dependency check of the
>>>>> first type will catch a dependency change of the second type with current pagespecs.
>>>>> This doesn't work with backlinks, and it doesn't work with existential matching. Backlinks are currently special-cased. I don't know
>>>>> how to special-case existential matching - I suspect you're better off just getting the dependency tracking right.

>>>>> I also tried to come up with other possible solutions: e.g. can we find the dependencies for a pagespec? That
>>>>> would be the set of pages where a change on one of those pages could lead to a change in the set of pages matched by the pagespec.
>>>>> For old-style pagespecs without backlinks, the dependency set for a pagespec is the same as the set of pages the pagespec matches.
>>>>> Unfortunately, with existential matching, the set of pages that each
>>>>> pagespec depends upon can quickly become "\*", which is not very useful. -- [[Will]]

### proposal

I propose the following. --[[Joey]]

* Add a second type of dependency, call it a "presence dependency".
* `add_depends` defaults to adding a regular ("full") dependency, as
  before. (So nothing breaks.)
* `add_depends($page, $spec, presence => 1)` adds a presence dependency.
* `refresh` only looks at added/removed pages when resolving presence
  dependencies.

This seems straightforwardly doable. I'd like [[Will]]'s feedback on it, if
possible. The two types of dependencies I am proposing are not identical
to the two types he talks about above, but I hope are close enough that
they can be used.

This doesn't deal with things that only depend on changes to the metadata
of a page, as collected in the scan pass. But it does leave a window
open for adding such a dependency type later.

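The difference between full and presence dependencies in the proposal can be sketched in Python. This is a toy model under stated assumptions, not the Perl implementation: pagespecs are reduced to shell-style globs via `fnmatch`, and the `Deps` class and its `needs_rebuild` method are invented for the example (only the `add_depends` name and the `presence` option come from the proposal).

```python
from fnmatch import fnmatchcase


class Deps:
    """Toy dependency table: dependent page -> list of (glob, presence_only)."""

    def __init__(self):
        self.table = {}

    def add_depends(self, page, spec, presence=False):
        # The default is a full dependency, so existing callers behave as before.
        self.table.setdefault(page, []).append((spec, presence))

    def needs_rebuild(self, page, added=(), removed=(), modified=()):
        for spec, presence_only in self.table.get(page, []):
            # Presence dependencies only look at added/removed pages.
            candidates = set(added) | set(removed)
            if not presence_only:
                candidates |= set(modified)
            if any(fnmatchcase(p, spec) for p in candidates):
                return True
        return False


deps = Deps()
deps.add_depends("sitemap", "blog/*", presence=True)  # e.g. a page with a map
deps.add_depends("frontpage", "blog/*")               # e.g. a page with an inline

edit = dict(modified=["blog/post1"])
new_page = dict(added=["blog/post2"])
```

With these tables, editing `blog/post1` rebuilds only `frontpage`, while adding `blog/post2` rebuilds both pages.
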
----

I implemented the above in a branch.
[[!template id=gitbranch branch=origin/dependency-types author="[[joey]]"]]

Then I found some problems:

* Something simple like pagecount, that seems like it could use a
  presence dependency, can have a pagespec that uses metadata, like
  `author()` or `copyright()`.
* pagestats, orphans and brokenlinks cannot use presence dependencies
  because they need to update when links change.

Now I'm thinking about having a special dependency look at page
metadata, and fire if the metadata changes. And it seems links should
either be included in that, or there should be a way to make a dependency
that fires when a page's links change. (And what about backlinks?)

It's easy to see when a page's links change, since there is `%oldlinks`.
To see when metadata is changed is harder, since it's stored in the
pagestate by the meta plugin. Also, there are many different types of
metadata, that would need to be matched with the pagespecs somehow.

Quick alternative: Make `add_depends` look at the pagespec. I.e., if it
is a simple page name, or a glob, we know a presence dependency
can be valid. If it's more complex, convert the dependency from
presence to full.

There is a lot to dislike about this method. Its parsing of the pagespec,
as currently implemented, does not let plugins add new types of pagespecs
that only care about presence. Its pagespec parsing is also subject to
false negatives (though these should be somewhat rare, and no false
positives). Still, it does work, and it makes things like simple maps and
pagecounts much more efficient.

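One way to picture the "quick alternative" is a conservative syntactic check. The Python sketch below is an assumption about what "a simple page name, or a glob" might mean; the `SIMPLE_SPEC` pattern and `effective_dependency` function are hypothetical and do not reproduce the check in the branch.

```python
import re

# Assumed definition of "simple": a single token made of page-name characters
# and glob wildcards, with no spaces, parentheses or negation.
SIMPLE_SPEC = re.compile(r"[-\w./+*?]+")


def effective_dependency(spec, presence_requested):
    """Return the dependency type that would actually be recorded."""
    if presence_requested and SIMPLE_SPEC.fullmatch(spec):
        return "presence"
    # Anything more complex is converted from presence to full.
    return "full"


simple = effective_dependency("blog/*", True)
metadata = effective_dependency("author(joey)", True)
compound = effective_dependency("blog/* or news/*", True)
```

In this sketch `blog/* or news/*` is downgraded to a full dependency even though it arguably only cares about presence, which may be the kind of false negative mentioned above. A metadata test such as `author(joey)` is never treated as presence-only, so the check errs on the side of extra rebuilds.
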
----

Link dependencies:

* `add_depends($page, $spec, links => 1, presence => 1)`
  adds a links + presence dependency.
* `refresh` only rebuilds a page with a links dependency if
  pages matched by the pagespec gain or lose links. (What the link
  actually points to may change independent of this, due to changes
  elsewhere, without it firing.)
* So, brokenlinks can fire whenever any links in any of the
  pages it's tracking change, or when pages are added or
  removed.

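A links + presence dependency as outlined in the list could be checked roughly as follows. This is a Python sketch with pagespecs again reduced to globs; `links_dependency_fires` is a hypothetical helper, and the `oldlinks` argument merely plays the role that `%oldlinks` has in ikiwiki.

```python
from fnmatch import fnmatchcase


def links_dependency_fires(spec, oldlinks, links):
    """True if a links + presence dependency on `spec` should cause a rebuild.

    `oldlinks` and `links` map page names to the links they had before and
    after the refresh.
    """
    old_pages = {p for p in oldlinks if fnmatchcase(p, spec)}
    new_pages = {p for p in links if fnmatchcase(p, spec)}
    if old_pages != new_pages:
        return True  # presence part: a matching page was added or removed
    # links part: some matching page gained or lost a link
    return any(set(oldlinks[p]) != set(links[p]) for p in new_pages)


oldlinks = {"bugs/a": ["done"], "bugs/b": [], "index": ["bugs/a"]}

unchanged = dict(oldlinks)                      # content edits that keep the links
gained = {**oldlinks, "bugs/b": ["done"]}       # bugs/b gains a link
removed = {"bugs/a": ["done"], "index": ["bugs/a"]}  # bugs/b deleted
elsewhere = {**oldlinks, "index": []}           # link change outside bugs/*
```

Only `gained` and `removed` make a dependency on `bugs/*` fire. The check compares link lists, not link targets, so a change in what a link resolves to goes unnoticed, as the second bullet notes.
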
TODO: How to determine if a pagespec is valid to be used with a links
dependency? Use the same simple pagespecs that are valid for presence
dependencies?