SyntaxTree functional specification
-----------------------------------

This document describes typical erroneous use cases in HTML code, how each error
should be marked, and how parsing should recover from it.

Feel free to add new cases or correct the existing ones.

Notes:
======
1) never put an unexpected tag onto the node stack => it cannot be paired
2) it is probably better to say "Unknown tag" rather than "Unknown content"

Use cases:
==========

A. Crossed tags

1. <p> has optional end and can contain <a>
<p><a></p></a>
      ^^^^ error: Unmatched tag

ast: p(a())

2. <p> has optional end and cannot contain <div>
<p><div></p></div>
   ^^^^^  error: Unexpected content
            ^^^^^^ error: Unmatched tag

recovery: </p> is matched to the <p>
ast: p()

3. <a> has required end and can contain <script> with required end
<a><script></a></script>
^^^        ^^^^ error: Unmatched tag

ast: a(script())

4. <a> has required end and cannot contain <div> with required end
<a><div></a></div>
   ^^^^^ error: Unexpected content
            ^^^^^^ error: Unmatched tag

ast: a()

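4.1 same as 4, but the disallowed <div> contains <b>, which <a> can contain
<a><div><b></b></a></div>
   ^^^^^           ^^^^^^ errors as in 4: Unexpected content, Unmatched tag
ast: a(b())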
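
sketch: the rules behind cases 1-4.1 can be written down as a small stack-based
parser. This is only an illustration under assumptions, not the actual
implementation: OPTIONAL_END, CAN_CONTAIN, Node and parse() are hypothetical
names, and the content model covers just the tags used in the cases above.

```python
from dataclasses import dataclass, field

# Hypothetical content model, limited to the tags used in cases 1-4.1.
OPTIONAL_END = {"p"}  # every other tag here needs an explicit end tag
CAN_CONTAIN = {
    "p": {"a"},
    "a": {"script", "b"},
    "div": {"p", "a", "b", "div"},
    "script": set(),
    "b": set(),
}


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

    def __str__(self):
        # Same notation as the "ast:" lines, e.g. p(a())
        return f"{self.name}({','.join(str(c) for c in self.children)})"


def parse(tokens):
    """tokens: e.g. ["<p>", "<a>", "</p>", "</a>"]. Returns (roots, errors)."""
    roots, stack, errors = [], [], []
    for tok in tokens:
        if tok.startswith("</"):
            name = tok[2:-1]
            if stack and stack[-1].name == name:
                stack.pop()  # regular match
            else:
                # crossed or never-opened end tag: flag it, keep the stack
                errors.append((tok, "Unmatched tag"))
        else:
            name = tok[1:-1]
            if stack and name not in CAN_CONTAIN.get(stack[-1].name, set()):
                # note 1: an unexpected tag never goes onto the stack
                errors.append((tok, "Unexpected content"))
                continue
            node = Node(name)
            (stack[-1].children if stack else roots).append(node)
            stack.append(node)
    # anything still open at the end must have an optional end tag
    for node in stack:
        if node.name not in OPTIONAL_END:
            errors.append((f"<{node.name}>", "Unmatched tag"))
    return roots, errors
```

For case 3, parse(["<a>", "<script>", "</a>", "</script>"]) yields the tree
a(script()) and flags both </a> and <a> as "Unmatched tag". Children of a
skipped tag (case 4.1) attach to the nearest element that is on the stack.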


B. Unexpected tag

a. known, paired
<html>
  ...
  <body>
    <style></style>
    ^^^^^^^^^^^^^^^ error: Unexpected tag content
  </body>
</html>

recovery: just ignore both tags and continue parsing

b. known, unpaired
<html>
  ...
  <body>
    <style>
    ^^^^^^^ error: Unexpected tag content
  </body>
</html>

c. unknown, paired or unpaired
<html>
  ...
  <body>
    <xyz>(</xyz>)
    ^^^^^ error: Unexpected tag content
         (^^^^^^) error: Unmatched tag, Unknown content (tag)
  </body>
</html>

C. Unresolved tags

<html>
    <head>
    ^^^^^^ error: Unresolved tag, expecting XXX
        // <title> is mandatory here
    </head>
    <body>
    </body>
</html>

note: whether a tag is resolved or not doesn't affect the parsing/nesting of
its content or following siblings
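
sketch: since resolution does not influence nesting, the check can run as a
separate pass over the finished tree. The following is a minimal illustration
under assumptions: REQUIRED_CHILDREN and find_unresolved() are hypothetical
names, and only the <head> -> <title> rule is taken from the example above.

```python
# Hypothetical table of mandatory children; only head -> title is from the spec.
REQUIRED_CHILDREN = {"head": ["title"]}


def find_unresolved(node):
    """node is a (name, children) tuple; returns a list of error messages."""
    name, children = node
    present = {child_name for child_name, _ in children}
    errors = [
        f"<{name}>: unresolved tag, expecting {missing}"
        for missing in REQUIRED_CHILDREN.get(name, [])
        if missing not in present
    ]
    # The pass only reads the tree, so parsing and nesting stay untouched.
    for child in children:
        errors.extend(find_unresolved(child))
    return errors


tree = ("html", [("head", []), ("body", [])])
print(find_unresolved(tree))
```

The error is reported on <head> itself, while <body> is still parsed and
nested as a regular sibling.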

Use case studies
================
legend: s == stack

Study (1)
<body>              //s(body)
<div>               //s(body,div1)
    <div>           //s(body,div1,div2)
        <div>       //s(body,div1,div2,div3)
            <div>   //s(body,div1,div2,div3,div4)
        </div>      //s(body,div1,div2,div3)
    </div>          //s(body,div1,div2)
</div>              //s(body,div1)
</body>             //s(body,div1) -- error, <body> is not on top of the stack ???

...at the end the stack still contains one <div> (the first one), so where
should the error be flagged?
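
sketch: the stack column of Study (1) can be reproduced with a plain matching
stack. This is an illustration only; trace() and its numbering of repeated
tags (div1, div2, ...) are hypothetical helpers, and the sketch does not
decide where the error belongs.

```python
from collections import Counter


def trace(tokens):
    """Return one (token, stack snapshot, note) row per token."""
    opens = Counter(t[1:-1] for t in tokens if not t.startswith("</"))
    seen = Counter()
    stack, rows = [], []  # stack holds (name, label) pairs
    for tok in tokens:
        note = ""
        if tok.startswith("</"):
            name = tok[2:-1]
            if stack and stack[-1][0] == name:
                stack.pop()
            else:
                # end tag does not match the top: leave the stack as it is
                note = f"<{name}> is not on top of the stack"
        else:
            name = tok[1:-1]
            seen[name] += 1
            # number a tag only if it is opened more than once (div1, div2, ...)
            label = f"{name}{seen[name]}" if opens[name] > 1 else name
            stack.append((name, label))
        rows.append((tok, [lbl for _, lbl in stack], note))
    return rows


study1 = ["<body>"] + ["<div>"] * 4 + ["</div>"] * 3 + ["</body>"]
for tok, snapshot, note in trace(study1):
    print(f"{tok:8} s({','.join(snapshot)}) {note}")
```

The last row shows </body> arriving while div1 is still on top, which leaves
two candidate positions for the error: the open tag of div1 or the </body>
end tag.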

Study (2)
<body>      //s(body)
<p>         //s(body,p1)
<p>         //s(body,p2) -- the previous <p> cannot contain this one and has an optional end, so it is closed by the start of this <p>
</body>     -- p2 remains on the stack, but it has an optional end, so it is closed by the start of this end tag
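
sketch: the two implicit-close rules of Study (2), written exactly as worded
there. This is an illustration under assumptions: OPTIONAL_END, CAN_CONTAIN
and trace_optional_end() are hypothetical names, and the sketch does not try
to reconcile the rule with case A.2, where a disallowed <div> inside <p> is
flagged instead of closing the <p>.

```python
OPTIONAL_END = {"p"}                       # assumption: only <p> may omit its end tag
CAN_CONTAIN = {"body": {"p"}, "p": set()}  # assumption: <p> cannot contain <p>


def trace_optional_end(tokens):
    """Return one (token, stack snapshot) row per token."""
    stack, rows = [], []
    for tok in tokens:
        if tok.startswith("</"):
            name = tok[2:-1]
            if name in stack:
                idx = len(stack) - 1 - stack[::-1].index(name)  # innermost match
                # everything above the match must be allowed to end implicitly
                if all(t in OPTIONAL_END for t in stack[idx + 1:]):
                    del stack[idx:]
        else:
            name = tok[1:-1]
            # an optional-end element that cannot contain the new tag is closed by it
            while (stack and stack[-1] in OPTIONAL_END
                   and name not in CAN_CONTAIN.get(stack[-1], set())):
                stack.pop()
            stack.append(name)
        rows.append((tok, list(stack)))
    return rows


for tok, snapshot in trace_optional_end(["<body>", "<p>", "<p>", "</body>"]):
    print(f"{tok:8} s({','.join(snapshot)})")
```

The second <p> replaces the first one on the stack, and </body> empties the
stack without any error being recorded.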

Study A.1
<p>         //s(p)
<a>         //s(p,a)
</p>        //s(a) -- p is not on top but has an optional end, while <a> has a required end => close p by the next stack item (a) and mark the </p> end tag as unmatched
</a>        //s()

Study A.2
<p>         //s(p)
<div>       //s(p) -- <div> is not allowed here, mark as error
</p>        //s()
</div>      -- unmatched end tag, mark as error

Study A.3
<a>         //s(a)
<script>    //s(a,script)
</a>        //s(a,script) -- a is not on top, script has a required end, a as well => flag the </a> end tag as unmatched
</script>   //s(a) -- close script normally
            -- a stays on the stack, mark it as unmatched

Study A.4
<a>         //s(a)
<div>       //s(a) -- <a> cannot contain <div>, mark <div> as not allowed here
</a>        //s()
</div>      -- </div> is an unmatched end tag


