parts of speech - Building a phrase structure of "On the weekend ..."
I'm reading Foundations of Statistical Natural Language Processing, and I'm doing one of the early exercises, trying to work out some of the inflection and part-of-speech behaviour of the word 'fun'.
- On the weekend the children had fun.
Trying to make a phrase structure parse of the above sentence, I'm not sure how to structure it. All formal grammars I've read describe a sentence as:
S → NP + VP
But I don't see how "on the weekend" could be a noun phrase? So far I've got this, but it doesn't seem right:
(PP On)(NP (D the)(N weekend))(S (NP (D the)(N children))(VP (V had)(NP (N fun))))
Resulting in this parse tree:
So my question is: What's the correct Part-of-speech tagging for (1)?
Answer
EDIT: Given the sentence:
On the weekend the children had fun.
You can get a dependency parse (described at the bottom of this posting) that looks like this:
Which I believe may be more of what you are looking for.
The Berkeley parser produces this parse using its simple online interface:
(ROOT
  (S
    (PP (IN On)
      (NP (DT the) (NN weekend)))
    (NP (DT the) (NNS children))
    (VP (VBD had)
      (NP (NN fun)))
    (. .)))
Which can be diagrammed this way:
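Since the question asks for the part-of-speech tagging, note that the tags are simply the labels sitting directly above each word in the bracketed output. Here is a minimal sketch, using only the Python standard library, that reads them off. The names `read_tree` and `pos_tags` are my own helpers for illustration and are not part of the Berkeley parser:

```python
import re

# The Berkeley parser output for the sentence, copied onto one line.
PARSE = ("(ROOT (S (PP (IN On) (NP (DT the) (NN weekend))) "
         "(NP (DT the) (NNS children)) (VP (VBD had) (NP (NN fun))) (. .)))")


def read_tree(text):
    """Parse a Penn-style bracketed string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return node()


def pos_tags(tree):
    """Collect (word, tag) pairs from the preterminal nodes, left to right."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        if not isinstance(child, str):
            pairs.extend(pos_tags(child))
    return pairs


tags = pos_tags(read_tree(PARSE))
for word, tag in tags:
    print(f"{word}/{tag}")
```

Under this parse, 'On' is tagged IN (preposition), 'weekend' and 'fun' are NN, 'children' is NNS, and 'had' is VBD.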
On the other hand, if you rearrange the sentence slightly:
The children had fun on the weekend.
You get this tree:
(ROOT
  (S
    (NP (DT The) (NNS children))
    (VP (VBD had)
      (NP
        (NP (NN fun))
        (PP (IN on)
          (NP (DT the) (NN weekend)))))
    (. .)))
Whose diagram is this:
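The difference between the two trees that matters here is where the PP attaches. A small sketch, assuming only the Python standard library, that reports the label of the node directly containing the PP in each bracketed parse. The function name `parent_of` is hypothetical, made up for this illustration:

```python
import re

FRONTED = ("(ROOT (S (PP (IN On) (NP (DT the) (NN weekend))) "
           "(NP (DT the) (NNS children)) (VP (VBD had) (NP (NN fun))) (. .)))")
REARRANGED = ("(ROOT (S (NP (DT The) (NNS children)) (VP (VBD had) "
              "(NP (NP (NN fun)) (PP (IN on) (NP (DT the) (NN weekend))))) (. .)))")


def parent_of(target, bracketed):
    """Return the label of the node that directly contains the first `target` node."""
    stack = []  # labels of the currently open constituents
    # Only "(LABEL" openers and ")" closers matter; leaf words are skipped.
    for tok in re.findall(r"\([^\s()]+|\)", bracketed):
        if tok == ")":
            stack.pop()
            continue
        label = tok[1:]
        if label == target:
            return stack[-1] if stack else None
        stack.append(label)
    return None


print("fronted PP attaches to:", parent_of("PP", FRONTED))
print("trailing PP attaches to:", parent_of("PP", REARRANGED))
```

For the two parses above this gives S for the fronted version and NP for the rearranged one, i.e. the parser hangs "on the weekend" off the noun phrase containing "fun" in the second sentence rather than off the verb phrase.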
For a simple parse, I think that is as good as you are going to get. But you may wish to try a parser that can show linkages more complex than the simple tree above illustrates. For example, using CMU’s Link Grammar Parser:
    +------------------------Xp-----------------------+
    +---------------Wd--------------+                 |
    |      +-----------CO-----------+                 |
    |      +----Js---+              |                 |
    |      |  +--Ds--+      +--Dmc--+---Sp--+--Os-+   |
    |      |  |      |      |       |       |     |   |
LEFT-WALL on the weekend.n the children.n had.v fun.n .
Notice the CO link applies that opener to the entire rest of the sentence. If you follow the CO link’s docs (http://www.link.cs.cmu.edu/link/dict/section-CO.html), you find that this is the “sentence opener” link, used to connect sentence openers to the subjects of sentences. It mentions, amongst other things:
Openers may take commas; almost all words with CO+ therefore have "({{Xd-} & Xc+} & CO+)". With participles and adjectives, the comma is obligatory; "*Still upset about Joe they went to a movie" seems wrong. The Xd- allows a comma before the phrase as well as after. This frequently happens if the opener is not at the beginning of the phrase: "They claimed that, on Tuesday, they went to a movie." If the opener begins the sentence, then a comma before the opener is of course incorrect, but we allow it. See "X: Comma phrases".
And indeed, if you add the comma, you get another element in the parse (the Xc element), but nothing else changes, showing that these are equivalent:
    +-------------------------Xp------------------------+
    +----------------Wd---------------+                 |
    |      +------------CO------------+                 |
    |      +-------Xc------+          |                 |
    |      +----Js---+     |          |                 |
    |      |  +--Ds--+     |  +--Dmc--+---Sp--+--Os-+   |
    |      |  |      |     |  |       |       |     |   |
LEFT-WALL on the weekend.n , the children.n had.v fun.n .
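To make the "nothing else changes" claim concrete, here is a rough sketch that treats each linkage as a set of (label, left word, right word) triples and compares the two. The triples are transcribed by hand from the two diagrams above, not produced by calling the Link Grammar Parser, and the variable names are my own:

```python
# Links read off the diagram without the comma: (label, left word, right word).
WITHOUT_COMMA = {
    ("Xp", "LEFT-WALL", "."),
    ("Wd", "LEFT-WALL", "children.n"),
    ("CO", "on", "children.n"),
    ("Js", "on", "weekend.n"),
    ("Ds", "the", "weekend.n"),
    ("Dmc", "the", "children.n"),
    ("Sp", "children.n", "had.v"),
    ("Os", "had.v", "fun.n"),
}

# Links read off the diagram with the comma after the opener.
WITH_COMMA = {
    ("Xp", "LEFT-WALL", "."),
    ("Wd", "LEFT-WALL", "children.n"),
    ("CO", "on", "children.n"),
    ("Xc", "on", ","),
    ("Js", "on", "weekend.n"),
    ("Ds", "the", "weekend.n"),
    ("Dmc", "the", "children.n"),
    ("Sp", "children.n", "had.v"),
    ("Os", "had.v", "fun.n"),
}

added = WITH_COMMA - WITHOUT_COMMA    # links that only appear with the comma
removed = WITHOUT_COMMA - WITH_COMMA  # links that disappear with the comma

print("added:", sorted(added))
print("removed:", sorted(removed))
```

The only link that differs is the Xc link from the opener to the comma; the CO link from 'on' to 'children.n' is present in both linkages.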
For more details, check out the Stanford NLP Group’s page here. I have used some of those tools, and they take a fair bit of set-up that I would never wish on a non-programmer, but they can be quite interesting.
You may wish to also try a dependency parse. These can be harder to read, but provide better linkages. One dependency parse visualization tool can be downloaded from here.
If you jump through all their hoops, you get the following output:
Notice that at last we can see that the PP at the start of the sentence correctly applies to the VP.