The minimum information needed is which occurrence of which pattern was encountered.
ODA-like SGML object addressing should be employed in the long term;
initially, the nth occurrence of "thisExactHTMLstring" would suffice. For example,
../some/url#3#The
would select the 3rd "The" from the source.
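
As a rough illustration of how such a reference might be resolved, here is a minimal Python sketch. It assumes the reference always has the form url#n#pattern, that n is counted from 1, that matching is exact and case-sensitive, and that overlapping matches count separately. None of this is settled by the text above, and the function names split_occurrence_ref and find_nth are hypothetical.

```python
def split_occurrence_ref(ref):
    """Split 'url#n#pattern' into (url, n, pattern).

    Assumes exactly this layout; the pattern itself may contain '#'.
    """
    url, n, pattern = ref.split("#", 2)
    return url, int(n), pattern


def find_nth(source, pattern, n):
    """Return the offset of the nth (1-based) exact occurrence of
    pattern in source, or -1 if there are fewer than n occurrences."""
    if n < 1 or not pattern:
        return -1
    pos = -1
    for _ in range(n):
        # Resume one character past the previous hit, so overlapping
        # matches are counted (an assumption, not part of the proposal).
        pos = source.find(pattern, pos + 1)
        if pos == -1:
            return -1
    return pos


url, n, pattern = split_occurrence_ref("../some/url#3#The")
source = "The cat. The dog. The end."
offset = find_nth(source, pattern, n)  # offset of the 3rd "The"
```

A reference whose occurrence count exceeds the number of matches yields -1 here; what a browser should do in that case is left open.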