Finding linguistic patterns using spaCy
This section teaches you to find linguistic patterns using spaCy, a natural language processing library for Python.
If you are unfamiliar with the linguistic annotations produced by spaCy or need to refresh your memory, revisit Part II before working through this section.
After reading this section, you should:
know how to search for patterns based on part-of-speech tags and morphological features
know how to search for patterns based on syntactic dependencies
know how to examine the matching patterns in their context of occurrence
Finding patterns using spaCy Matchers
Linguistic annotations, such as part-of-speech tags, syntactic dependencies and morphological features, help impose structure on written language. Crucially, linguistic annotations allow searching for structural patterns instead of individual words or phrases. This allows defining search patterns in a flexible way.
In the spaCy library, the capability for pattern search is provided by various components named Matchers.
spaCy provides three types of Matchers:
A Matcher, which allows defining rules that search for particular words or phrases by examining Token attributes.
A DependencyMatcher, which allows searching parse trees for syntactic patterns.
A PhraseMatcher, which provides a fast way of searching a Doc object for phrases that are themselves defined as Doc objects.
The following sections show you how to use the Matcher for matching Tokens and their sequences based on their part-of-speech tags and morphological features, and how to use the DependencyMatcher for matching syntactic dependencies.
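The PhraseMatcher is not covered in the sections below, so here is a minimal sketch of how it can be used. The sketch assumes a blank English pipeline, which is sufficient because only tokenisation is needed. The example sentence, the rule name movement and the variable names are made up for illustration.

```python
import spacy
from spacy.matcher import PhraseMatcher

# A blank English pipeline only tokenises text; no trained model is needed
blank_nlp = spacy.blank('en')

# Create a PhraseMatcher that shares the vocabulary of the pipeline
phrase_matcher = PhraseMatcher(blank_nlp.vocab)

# The phrases to search for are provided as Doc objects; 'movement' is a made-up rule name
phrase_matcher.add('movement', [blank_nlp.make_doc('Occupy Wall Street')])

# A made-up example sentence
example = blank_nlp.make_doc('Thousands joined Occupy Wall Street in 2011.')

# Collect the text of each matching Span
phrase_matches = [span.text for span in phrase_matcher(example, as_spans=True)]

print(phrase_matches)
```

Because the phrases are defined as Doc objects, the PhraseMatcher is convenient when you already have a long list of names or terms to look for.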
Matching words or phrases
To get started with the Matcher, let’s import the spaCy library and load a small language model for English.
# Import the spaCy library into Python
import spacy
# Load a small language model for English; assign the result under 'nlp'
nlp = spacy.load('en_core_web_sm')
To have some data to work with, let’s load some text from a Wikipedia article.
To do so, we use Python’s open() function in combination with the with statement to open the file for reading, providing the file, mode and encoding arguments, as instructed in Part II.
We then call the read() method to read the file contents and store the result under the variable text.
# Use the open() function with the 'with' statement to open the file for reading
with open(file='data/occupy.txt', mode='r', encoding='utf-8') as file:
    # Use the read() method to read the contents of the file; assign result under
    # the variable 'text'
    text = file.read()
This returns a Python string object that contains the article in plain text, which is now available under the variable text.
Next, we feed this object to the language model under the variable nlp as instructed in Part II.
We also use Python’s len() function to count the number of Tokens in the resulting Doc object.
# Feed the string object to the language model
doc = nlp(text)
# Use the len() function to check length of the Doc object to count
# how many Tokens are contained within the Doc.
len(doc)
14867
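Note that len() counts Tokens rather than words, and Tokens also include punctuation marks. The following minimal sketch illustrates the difference using a blank English pipeline, which only performs tokenisation. The example sentence and the variable names are made up for illustration.

```python
import spacy

# A blank English pipeline only tokenises text; no trained model is needed
blank_nlp = spacy.blank('en')

# A made-up example sentence
sentence = 'Occupy began in 2011.'

# Feed the sentence to the pipeline to get a Doc object
toy_doc = blank_nlp(sentence)

# Count whitespace-separated words and spaCy Tokens
n_words = len(sentence.split())
n_tokens = len(toy_doc)

# The final full stop is a Token of its own, so the counts differ
print(n_words, n_tokens)
```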
Now that we have a spaCy Doc object with nearly 15 000 Tokens, we can continue to import the Matcher class from the matcher submodule of spaCy.
# Import the Matcher class
from spacy.matcher import Matcher
Importing the Matcher class from spaCy’s matcher submodule allows creating Matcher objects.
When creating a Matcher object, you must provide it with the vocabulary of the language model that produced the Doc objects to be searched.
The reason is that the Matcher and the Doc objects must share the same vocabulary: spaCy stores strings, such as the names of patterns, as hash values in the vocabulary, and the Matcher needs the same vocabulary to interpret them.
spaCy stores the vocabulary of a model in a Vocab object. The Vocab object can be found under the attribute vocab of a spaCy Language object, which was introduced in Part II.
In this case, we have the Language object that contains a small language model for English stored under the variable nlp, which means we can access its Vocab object under the attribute nlp.vocab.
We then call the Matcher class and provide the vocabulary under nlp.vocab to the vocab argument to create a Matcher object. We store the resulting object under the variable matcher.
# Create a Matcher and provide model vocabulary; assign result under the variable 'matcher'
matcher = Matcher(vocab=nlp.vocab)
# Call the variable to examine the object
matcher
<spacy.matcher.matcher.Matcher at 0x295de8140>
The Matcher object is now ready to store the patterns that we want to search for.
These patterns, or more specifically, pattern rules, are created using a specific format defined in spaCy.
Each pattern consists of a Python list, which is populated by Python dictionaries.
Each dictionary in this list describes the pattern for matching a single spaCy Token.
If you wish to match a sequence of Tokens, you must define multiple dictionaries within a single list, whose order follows that of the pattern to be matched.
Let’s start by defining a simple pattern for matching sequences of pronouns and verbs, and store this pattern under the variable pronoun_verb.
This pattern consists of a list, as marked by the surrounding brackets [], which contains two dictionaries, marked by curly braces {} and separated by a comma. The key and value pairs in a dictionary are separated by a colon.
The dictionary key determines which Token attribute should be searched for matches. The attributes supported by the Matcher are listed in the spaCy documentation.
The value under the dictionary key determines the specific value for the attribute.
In this case, we define a pattern that searches for a sequence of two coarse part-of-speech tags (POS), which were introduced in Part II, namely pronouns (PRON) and verbs (VERB).
Note that both the attribute keys and the coarse part-of-speech tags must be written in uppercase letters.
# Define a list with nested dictionaries that contains the pattern to be matched
pronoun_verb = [{'POS': 'PRON'}, {'POS': 'VERB'}]
Now that we have defined the pattern using a list and dictionaries, we can add it to the Matcher object under the variable matcher.
This can be achieved using the add() method, which requires two inputs:
A Python string object that defines a name for the pattern. This is required for purposes of identification.
A list containing the pattern(s) to be searched for. Because a single rule for matching patterns can contain multiple patterns, the input must be a list of lists. We therefore wrap the input lists into brackets, e.g. [pattern_1].
# Add the pattern to the matcher under the name 'pronoun+verb'
matcher.add("pronoun+verb", patterns=[pronoun_verb])
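To illustrate why the second input is a list of lists, the following minimal sketch registers two alternative patterns under a single rule name. It assumes a small, hand-annotated Doc instead of the Wikipedia article, so that no trained model is needed. The example words, the rule name subject+verb and the variable names are made up for illustration. By default, the Matcher returns each match as a tuple of a match ID, a start index and an end index.

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-assigned part-of-speech tags (made-up example)
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['Obama', 'spoke', 'and', 'they', 'listened'],
              pos=['PROPN', 'VERB', 'CCONJ', 'PRON', 'VERB'])

# Two alternative patterns: a pronoun or a proper noun followed by a verb
pron_verb = [{'POS': 'PRON'}, {'POS': 'VERB'}]
propn_verb = [{'POS': 'PROPN'}, {'POS': 'VERB'}]

# Register both patterns under a single rule name by placing them in one list
toy_matcher = Matcher(toy_vocab)
toy_matcher.add('subject+verb', [pron_verb, propn_verb])

# Each match is a (match_id, start, end) tuple; slice the Doc to get the text
toy_matches = sorted(toy_doc[start:end].text for _, start, end in toy_matcher(toy_doc))

print(toy_matches)
```

Both matches are found under the same rule name, although they were produced by different patterns.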
To search for matches in the Doc object stored under the variable doc, we feed the Doc object to the Matcher and store the result under the variable result.
We also set the optional argument as_spans to True, which instructs spaCy to return the results as Span objects.
As you may remember from Part II, Span objects correspond to continuous “slices” of Doc objects.
# Apply the Matcher to the Doc object under 'doc'; provide the argument
# 'as_spans' and set its value to True to get Spans as output
result = matcher(doc, as_spans=True)
# Call the variable to examine the output
result
[that expressed,
It aimed,
It formed,
this began,
it organizes,
that read,
who designed,
He wrote,
there were,
They promoted,
It refers,
which started,
that indicate,
that allowed,
they saw,
they argued,
they called,
it takes,
that reflected,
that strip,
they called,
that took,
who comment,
them using,
they belong,
which premiered,
himself warned,
he said,
they think,
them gain,
they wished,
that saw,
there was,
they blamed,
I support,
Some believe,
that followed,
It showed,
Some find,
Some believe,
there was,
which involves,
which showed,
who gave,
Some said,
they refused,
they saw,
who caused,
that drew,
there were,
who sought,
there were,
that lasted,
There was,
This came,
which shut,
who made,
that took,
This included,
which prohibit,
that hosted,
which monitors,
those wishing,
who criticized,
it returned,
its proposed,
They received,
that advocates,
there have,
There are,
it came,
it gained,
He claimed,
they presented,
which took,
they call,
that started,
which saw,
all set,
there were,
It consists,
there were,
there were,
which started,
there was,
which featured,
all finding,
What started,
there was,
There was,
Some said,
they began,
they perceived,
that threaten,
they say,
We agree,
we see,
There's,
There are,
who say,
they do,
they reflect,
He mentioned,
We regard,
who participated,
he wrote,
we have,
who dislike,
that burdens,
they employ,
they have,
there were,
that differs,
that follow,
there is,
that abstract,
it stall,
who emerged,
It pushes,
who called,
that deals,
that believe]
The output is a list of spaCy Span objects that match the requested pattern. Let’s examine the first object in the list of matches in greater detail.
result[0]
that expressed
The Span object has various useful attributes, including start and end. These attributes contain the indices that indicate where in the Doc object the Span starts and finishes.
result[0].start, result[0].end
(14, 16)
Another useful attribute is label, which contains the name that we gave to the pattern. Let’s take a closer look at this attribute.
result[0].label
12298179334642351811
The number stored under the label attribute is a hash value that identifies a spaCy Lexeme object, that is, an entry in the language model’s vocabulary.
This Lexeme contains the name that we gave to the search pattern above, namely pronoun+verb.
We can easily verify this by using the value under result[0].label to fetch the Lexeme from the Vocab object under nlp.vocab and examining its text attribute.
# Access the model vocabulary using brackets; provide the value under 'result[0].label' as key.
# Then get the 'text' attribute for the Lexeme object, which contains the lexeme in a human-readable form.
nlp.vocab[result[0].label].text
'pronoun+verb'
The information under the label attribute is useful for disambiguating between patterns, especially if the same Matcher object contains multiple different patterns, as we will see shortly below.
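There is more than one way to turn the number under label back into the name of the pattern. The following minimal sketch compares three of them on a small, hand-annotated Doc, so that no trained model is needed. The example words and the variable names are made up for illustration.

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-assigned part-of-speech tags (made-up example)
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['They', 'protested', 'loudly'],
              pos=['PRON', 'VERB', 'ADV'])

# Create a Matcher and add the pronoun and verb pattern to it
toy_matcher = Matcher(toy_vocab)
toy_matcher.add('pronoun+verb', [[{'POS': 'PRON'}, {'POS': 'VERB'}]])

# Get the first (and only) match as a Span object
span = toy_matcher(toy_doc, as_spans=True)[0]

# 1. Fetch the Lexeme from the vocabulary and read its 'text' attribute
via_lexeme = toy_vocab[span.label].text

# 2. Look up the hash value in the StringStore under 'vocab.strings'
via_strings = toy_vocab.strings[span.label]

# 3. Use the attribute 'label_' (with an underscore), which holds the string directly
via_shortcut = span.label_

print(via_lexeme, via_strings, via_shortcut)
```

All three approaches return the same string. The attribute label_ is the shortest to write when you already have a Span object at hand.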
Looking at the matches above, the pattern we defined is quite restrictive, as the pronoun and the verb must follow each other.
We cannot, for example, match patterns in which the verb is preceded by auxiliary verbs.
spaCy allows increasing the flexibility of pattern rules using operators.
These operators are defined by adding the key OP to the dictionary that defines a pattern for a single Token. spaCy supports the following operators:
!: Negate the pattern; the pattern can occur exactly zero times.
?: Make the pattern optional; the pattern may occur zero or one times.
+: Require the pattern to occur one or more times.
*: Allow the pattern to match zero or more times.
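The difference between the operators is easiest to see on a small example. The following minimal sketch compares the operators ? and * on a hand-annotated Doc, so that no trained model is needed. The example words, the rule names optional_aux and any_aux, and the variable names are made up for illustration.

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-assigned part-of-speech tags (made-up example)
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['They', 'protested', ',', 'they', 'have', 'been', 'protesting'],
              pos=['PRON', 'VERB', 'PUNCT', 'PRON', 'AUX', 'AUX', 'VERB'])

ops_matcher = Matcher(toy_vocab)

# '?': at most one auxiliary may occur between the pronoun and the verb
ops_matcher.add('optional_aux',
                [[{'POS': 'PRON'}, {'POS': 'AUX', 'OP': '?'}, {'POS': 'VERB'}]])

# '*': any number of auxiliaries, including none, may occur
ops_matcher.add('any_aux',
                [[{'POS': 'PRON'}, {'POS': 'AUX', 'OP': '*'}, {'POS': 'VERB'}]])

# Group the matching Spans by the name of the rule that produced them
found = {}
for span in ops_matcher(toy_doc, as_spans=True):
    found.setdefault(span.label_, []).append(span.text)

# Sort the rule names and matches to print them in a stable order
for name in sorted(found):
    print(name, sorted(found[name]))
```

In this sketch, the rule with ? only finds the sequence without auxiliaries, because the second clause contains two of them, whereas the rule with * finds both sequences.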
Let’s explore the use of operators by defining another pattern rule, which extends the scope of our Matcher.
To do so, we define another pattern for a Token between the pronoun and the verb. This Token must have the coarse part-of-speech tag AUX, which indicates an auxiliary verb:
{'POS': 'AUX', 'OP': '+'}
In addition, we add another key and value pair to the dictionary for this Token, which contains the key OP with the value +. This means that the Token corresponding to an auxiliary verb must occur one or more times.
We store the resulting list with nested dictionaries under the variable pronoun_aux_verb, and add the pattern to the existing Matcher object stored under the variable matcher.
# Define a list with nested dictionaries that contains the pattern to be matched
pronoun_aux_verb = [{'POS': 'PRON'}, {'POS': 'AUX', 'OP': '+'}, {'POS': 'VERB'}]
# Add the pattern to the matcher under the name 'pronoun+aux+verb'
matcher.add('pronoun+aux+verb', patterns=[pronoun_aux_verb])
# Apply the Matcher to the Doc object under 'doc'; provide the argument 'as_spans'
# and set its value to True to get Spans as output. Store the matches for both
# patterns under the variable 'results'.
results = matcher(doc, as_spans=True)
Just as above, the Matcher returns a list of spaCy Span objects.
Let’s loop over each item in the list results. We use the variable result to refer to the individual Span objects in the list, which contain our matches.
We first retrieve the hash value stored under result.label, which we use to fetch the corresponding Lexeme from the language model’s vocabulary under nlp.vocab.
As we learned above, this Lexeme corresponds to the name that we gave to the pattern rule, whose human-readable form can be found under the attribute text.
We then print a tabulator character to insert some space between the name of the pattern and the Span object containing the match.
# Loop over each Span object in the list 'results'
for result in results:
    # Print out the name of the pattern rule, a tabulator character, and the matching Span
    print(nlp.vocab[result.label].text, '\t', result)
pronoun+verb that expressed
pronoun+verb It aimed
pronoun+verb It formed
pronoun+verb this began
pronoun+verb it organizes
pronoun+aux+verb that had resulted
pronoun+verb that read
pronoun+verb who designed
pronoun+verb He wrote
pronoun+verb there were
pronoun+verb They promoted
pronoun+verb It refers
pronoun+verb which started
pronoun+verb that indicate
pronoun+verb that allowed
pronoun+aux+verb they did have
pronoun+verb they saw
pronoun+verb they argued
pronoun+verb they called
pronoun+verb it takes
pronoun+aux+verb they were working
pronoun+verb that reflected
pronoun+aux+verb which has been gathered
pronoun+verb that strip
pronoun+aux+verb who had lost
pronoun+verb they called
pronoun+verb that took
pronoun+aux+verb themselves be informed
pronoun+aux+verb that can be traced
pronoun+verb who comment
pronoun+verb them using
pronoun+aux+verb anyone can join
pronoun+aux+verb what is called
pronoun+verb they belong
pronoun+verb which premiered
pronoun+verb himself warned
pronoun+verb he said
pronoun+verb they think
pronoun+aux+verb they will change
pronoun+aux+verb it can help
pronoun+verb them gain
pronoun+verb they wished
pronoun+verb that saw
pronoun+verb there was
pronoun+verb they blamed
pronoun+aux+verb It was organized
pronoun+verb I support
pronoun+aux+verb I saw expressed
pronoun+verb Some believe
pronoun+verb that followed
pronoun+verb It showed
pronoun+verb Some find
pronoun+verb Some believe
pronoun+verb there was
pronoun+aux+verb It was renamed
pronoun+verb which involves
pronoun+verb which showed
pronoun+verb who gave
pronoun+aux+verb they may want
pronoun+verb Some said
pronoun+verb they refused
pronoun+aux+verb who were arrested
pronoun+verb they saw
pronoun+verb who caused
pronoun+verb that drew
pronoun+verb there were
pronoun+verb who sought
pronoun+verb there were
pronoun+aux+verb It was reported
pronoun+aux+verb they were beginning
pronoun+aux+verb they had received
pronoun+verb that lasted
pronoun+verb There was
pronoun+aux+verb he would bring
pronoun+verb This came
pronoun+verb which shut
pronoun+verb who made
pronoun+verb that took
pronoun+verb This included
pronoun+aux+verb which were attended
pronoun+aux+verb that were cited
pronoun+verb which prohibit
pronoun+aux+verb they're obligated
pronoun+aux+verb which has provided
pronoun+verb that hosted
pronoun+verb which monitors
pronoun+aux+verb which is raising
pronoun+aux+verb which has developed
pronoun+verb those wishing
pronoun+aux+verb who were detained
pronoun+verb who criticized
pronoun+verb it returned
pronoun+verb its proposed
pronoun+verb They received
pronoun+verb that advocates
pronoun+aux+verb which is focused
pronoun+verb there have
pronoun+verb There are
pronoun+aux+verb it was torn
pronoun+aux+verb it has spread
pronoun+aux+verb whom were left
pronoun+verb it came
pronoun+verb it gained
pronoun+aux+verb This is attributed
pronoun+aux+verb which were focused
pronoun+verb He claimed
pronoun+verb they presented
pronoun+verb which took
pronoun+verb they call
pronoun+verb that started
pronoun+verb which saw
pronoun+verb all set
pronoun+aux+verb it was reported
pronoun+verb there were
pronoun+verb It consists
pronoun+verb there were
pronoun+verb there were
pronoun+verb which started
pronoun+verb there was
pronoun+aux+verb which was evicted
pronoun+verb which featured
pronoun+verb all finding
pronoun+verb What started
pronoun+aux+verb who were occupying
pronoun+aux+verb that could be used
pronoun+verb there was
pronoun+aux+verb It was expected
pronoun+aux+verb it was disbanded
pronoun+aux+verb it was fenced
pronoun+verb There was
pronoun+verb Some said
pronoun+verb they began
pronoun+verb they perceived
pronoun+verb that threaten
pronoun+verb they say
pronoun+verb We agree
pronoun+verb we see
pronoun+verb There's
pronoun+aux+verb I can understand
pronoun+aux+verb it will grow
pronoun+aux+verb it will bring
pronoun+verb There are
pronoun+verb who say
pronoun+aux+verb we can build
pronoun+verb they do
pronoun+aux+verb they're penalized
pronoun+verb they reflect
pronoun+verb He mentioned
pronoun+verb We regard
pronoun+verb who participated
pronoun+aux+verb they were removed
pronoun+verb he wrote
pronoun+verb we have
pronoun+aux+verb they have been protesting
pronoun+aux+verb they will have made
pronoun+verb who dislike
pronoun+verb that burdens
pronoun+verb they employ
pronoun+verb they have
pronoun+aux+verb it has cleared
pronoun+verb there were
pronoun+aux+verb which would overturn
pronoun+aux+verb it would have
pronoun+aux+verb what became known
pronoun+aux+verb that were scheduled
pronoun+verb that differs
pronoun+verb that follow
pronoun+verb there is
pronoun+verb that abstract
pronoun+verb it stall
pronoun+verb who emerged
pronoun+verb It pushes
pronoun+verb who called
pronoun+aux+verb whom have observed
pronoun+aux+verb who are running
pronoun+verb that deals
pronoun+aux+verb you're going
pronoun+verb that believe
The output shows that the pattern we added to the Matcher matches patterns that contain one (e.g. “we can build”) or more (e.g. “they have been protesting”) auxiliaries!
Matching morphological features
As introduced in Part II, spaCy can also perform morphological analysis, whose results are stored under the attribute morph of a Token object.
The morph attribute contains a MorphAnalysis object. In its string form, each morphological feature is separated by a vertical bar |, as illustrated below.
We Case=Nom|Number=Plur|Person=1|PronType=Prs
As you can see, the names of morphological features, e.g. Case, and their values, e.g. Nom (for the nominative case), are separated by equals signs =.
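The morphological features can also be accessed individually. The following minimal sketch shows how to do this, assuming a small Doc whose morphological features have been assigned by hand, so that no trained model is needed. The example words and the variable names are made up for illustration.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-assigned morphological features (made-up example)
morph_doc = Doc(Vocab(),
                words=['We', 'agree'],
                morphs=['Case=Nom|Number=Plur|Person=1|PronType=Prs',
                        'Tense=Pres|VerbForm=Fin'])

# Take the first Token in the Doc
token = morph_doc[0]

# Cast the morphological analysis into a string
features_as_string = str(token.morph)

# Use the get() method to retrieve the values of a single feature as a list
case_values = token.morph.get('Case')

# Use the to_dict() method to get the features as a Python dictionary
features_as_dict = token.morph.to_dict()

print(features_as_string)
print(case_values)
print(features_as_dict)
```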
Let’s begin exploring how we can define pattern rules that match morphological features.
To get started, we create a new Matcher object named morph_matcher.
# Create a Matcher and provide model vocabulary; assign result under the variable 'morph_matcher'
morph_matcher = Matcher(vocab=nlp.vocab)
We then define a new pattern with rules for two Tokens:
Tokens that have a fine-grained part-of-speech tag NNP (proper noun), which can occur one or more times (operator: +).
{'TAG': 'NNP', 'OP': '+'}
Tokens that have a coarse part-of-speech tag VERB and exactly the following morphological features (MORPH): Number=Sing|Person=Three|Tense=Pres|VerbForm=Fin.
{'POS': 'VERB', 'MORPH': 'Number=Sing|Person=Three|Tense=Pres|VerbForm=Fin'}
We define the pattern using two dictionaries in a list, which we assign under the variable propn_3rd_finite.
# Define a list with nested dictionaries that contains the pattern to be matched
propn_3rd_finite = [{'TAG': 'NNP', 'OP': '+'},
                    {'POS': 'VERB', 'MORPH': 'Number=Sing|Person=Three|Tense=Pres|VerbForm=Fin'}]
We then add the pattern to the newly-created Matcher stored under the variable morph_matcher using the add() method.
We also provide the value LONGEST to the optional argument greedy for the add() method.
The greedy argument filters overlapping matches, which arise when a pattern includes operators such as + that can match more than one Token.
By setting the value to LONGEST, spaCy returns only the longest matching sequence instead of returning a match every time it finds one. Put differently, spaCy will collect all the matching Tokens before returning them.
# Add the pattern to the matcher under the name 'sing_3rd_pres_fin'
morph_matcher.add('sing_3rd_pres_fin', patterns=[propn_3rd_finite], greedy='LONGEST')
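To see what the greedy argument does, the following minimal sketch applies the same pattern with and without it. The sketch assumes a small, hand-annotated Doc, so that no trained model is needed. The example words, the rule name propn+verb and the variable names are made up for illustration.

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-assigned part-of-speech tags (made-up example)
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['President', 'Barack', 'Obama', 'spoke'],
              tags=['NNP', 'NNP', 'NNP', 'VBD'],
              pos=['PROPN', 'PROPN', 'PROPN', 'VERB'])

# One or more proper nouns followed by a verb
toy_pattern = [{'TAG': 'NNP', 'OP': '+'}, {'POS': 'VERB'}]

# A Matcher without the 'greedy' argument returns every possible match
plain_matcher = Matcher(toy_vocab)
plain_matcher.add('propn+verb', [toy_pattern])

# A Matcher with greedy='LONGEST' keeps only the longest of the overlapping matches
longest_matcher = Matcher(toy_vocab)
longest_matcher.add('propn+verb', [toy_pattern], greedy='LONGEST')

all_texts = [span.text for span in plain_matcher(toy_doc, as_spans=True)]
longest_texts = [span.text for span in longest_matcher(toy_doc, as_spans=True)]

print(all_texts)
print(longest_texts)
```

Without the greedy argument, the shorter sequences "Barack Obama spoke" and "Obama spoke" are returned in addition to the full sequence, because each of them also satisfies the pattern.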
We then apply the Matcher to the data stored under the variable doc.
# Apply the Matcher to the Doc object under 'doc'; provide the argument 'as_spans'
# and set its value to True to get Spans as output. Store the result under the
# variable 'morph_results'.
morph_results = morph_matcher(doc, as_spans=True)
# Loop over each Span object in the list 'morph_results'
for result in morph_results:
    # Print result
    print(result)
As you can see, the matches are relatively few in number, because we defined that the verb should have quite specific morphological features.
The question is, then, how can we match just some morphological features?
To loosen the criteria for morphological features by focusing on tense only, we need to use a dictionary with the key MORPH, but instead of a string object, we provide a dictionary as its value.
For this dictionary, we use the string IS_SUPERSET as the key. IS_SUPERSET is one of the attributes defined in the spaCy pattern format, e.g.
{'MORPH': {'IS_SUPERSET': [...]}}
Before proceeding any further, let’s unpack the logic behind IS_SUPERSET a bit.
We can think of morphological features associated with a given Token as a set. To exemplify, a set could consist of the following four items:
Number=Sing, Person=Three, Tense=Pres, VerbForm=Fin
If we had another set with just one item, Tense=Pres, we could describe the relationship between the two sets by stating that the first set (with four items) is a superset of the second set (with one item).
In other words, the larger (super)set contains the smaller (sub)set.
This is also how matching using IS_SUPERSET works: spaCy retrieves the morphological features for each Token, and examines whether these features are a superset of the features defined in the search pattern.
The morphological features to be searched for are provided as a list of Python strings.
These strings, in turn, define particular morphological features, e.g. Tense=Past, as defined in the Universal Dependencies schema for describing morphology, which was introduced in the previous section.
This list is then used as the value for the key IS_SUPERSET.
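The superset relation described above can be tried out directly using Python sets, and the same logic can then be observed in the Matcher. The following minimal sketch assumes a small Doc whose annotations have been assigned by hand, so that no trained model is needed. The example words, the rule names exact and superset, and the variable names are made up for illustration.

```python
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# The set logic in plain Python: the larger set contains the smaller set
token_features = {'Number=Sing', 'Person=Three', 'Tense=Pres', 'VerbForm=Fin'}
is_superset = token_features.issuperset({'Tense=Pres'})

# Build a tiny Doc with hand-assigned annotations (made-up example)
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['Crowds', 'gathered', ',', 'speakers', 'arrive'],
              pos=['NOUN', 'VERB', 'PUNCT', 'NOUN', 'VERB'],
              morphs=['Number=Plur', 'Tense=Past|VerbForm=Fin', 'PunctType=Comm',
                      'Number=Plur', 'Tense=Pres|VerbForm=Fin'])

# A string value must match the morphological features of a Token exactly
exact_matcher = Matcher(toy_vocab)
exact_matcher.add('exact', [[{'POS': 'VERB', 'MORPH': 'Tense=Past'}]])

# IS_SUPERSET only requires that the listed features are among those of the Token
superset_matcher = Matcher(toy_vocab)
superset_matcher.add('superset',
                     [[{'POS': 'VERB', 'MORPH': {'IS_SUPERSET': ['Tense=Past']}}]])

exact_texts = [span.text for span in exact_matcher(toy_doc, as_spans=True)]
superset_texts = [span.text for span in superset_matcher(toy_doc, as_spans=True)]

print(is_superset)
print(exact_texts)
print(superset_texts)
```

The exact pattern finds nothing, because the verb "gathered" also carries the feature VerbForm=Fin, whereas the pattern with IS_SUPERSET matches it.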
Let’s now proceed to search for verbs in the past tense and add them to the Matcher object under morph_matcher.
# Define a list with nested dictionaries that contains the pattern to be matched
past_tense = [{'TAG': 'NNP', 'OP': '+'},
              {'POS': 'VERB', 'MORPH': {'IS_SUPERSET': ['Tense=Past']}}]
# Add the pattern to the matcher under the name 'past_tense'
morph_matcher.add('past_tense', patterns=[past_tense], greedy='LONGEST')
# Apply the Matcher to the Doc object under 'doc'; provide the argument 'as_spans'
# and set its value to True to get Spans as output. Overwrite previous matches by
# storing the result under the variable 'morph_results'.
morph_results = morph_matcher(doc, as_spans=True)
Let’s loop over the results and print out the name of the pattern, the Span object containing the match, and the morphological features of the final Token in the match, which corresponds to the verb.
# Loop over each Span object in the list 'morph_results'
for result in morph_results:
    # Print out the name of the pattern rule, a tabulator character, and the matching Span.
    # Finally, print another tabulator character, followed by the morphological features of the
    # last Token in the match (a verb).
    print(nlp.vocab[result.label].text, '\t', result, '\t', result[-1].morph)
past_tense Community Environmental Legal Defense Fund released Tense=Past|VerbForm=Fin
past_tense Oakland Police Chief Howard Jordan expressed Tense=Past|VerbForm=Fin
past_tense U.S. Vice President Al Gore called Tense=Past|VerbForm=Fin
past_tense Los Angeles City Council became Tense=Past|VerbForm=Fin
past_tense Judge Jed S. Rakoff sided Tense=Past|VerbForm=Fin
past_tense Finance Minister Jim Flaherty expressed Tense=Past|VerbForm=Fin
past_tense Prime Minister Manmohan Singh described Tense=Past|VerbForm=Fin
past_tense Supreme Leader Ayatollah Khamenei voiced Tense=Past|VerbForm=Fin
past_tense Prime Minister Gordon Brown said Tense=Past|VerbForm=Fin
past_tense Anti-Defamation League stated Tense=Past|VerbForm=Fin
past_tense Occupy Wall Street endorsed Tense=Past|VerbForm=Fin
past_tense New York Times reported Tense=Past|VerbForm=Fin
past_tense Occupy Wall Street said Tense=Past|VerbForm=Fin
past_tense Lieutenant John Pike used Tense=Past|VerbForm=Fin
past_tense Occupy Wall Street attempted Tense=Past|VerbForm=Fin
past_tense Mayor Charlie Hales ordered Tense=Past|VerbForm=Fin
past_tense Pietro al Laterano received Tense=Past|VerbForm=Fin
past_tense Taksim Gezi Park developed Tense=Past|VerbForm=Fin
past_tense International Press Institute commented Tense=Past|VerbForm=Fin
past_tense President Dilma Rousseff said Tense=Past|VerbForm=Fin
past_tense Edinburgh City Council set Tense=Past|VerbForm=Fin
past_tense President Barack Obama spoke Tense=Past|VerbForm=Fin
past_tense New York City sent Tense=Past|VerbForm=Fin
past_tense President Hugo Chávez condemned Tense=Past|VerbForm=Fin
past_tense American Dialect Society voted Tense=Past|VerbForm=Fin
past_tense Manfred Steger called Tense=Past|VerbForm=Fin
past_tense Cornel West described Tense=Past|VerbForm=Fin
past_tense Huffington Post noted Tense=Past|VerbForm=Fin
past_tense Kalle Lasn registered Tense=Past|VerbForm=Fin
past_tense Democracy Village set Tense=Past|VerbForm=Fin
past_tense Naomi Wolf argued Tense=Past|VerbForm=Fin
past_tense Judith Butler criticized Tense=Past|VerbForm=Fin
past_tense People Link offered Tense=Past|VerbForm=Fin
past_tense Manuel Castells congratulated Tense=Past|VerbForm=Fin
past_tense Naomi Klein congratulated Tense=Past|VerbForm=Fin
past_tense USA Today said Tense=Past|VerbForm=Fin
past_tense Anthony Barnett said Tense=Past|VerbForm=Fin
past_tense Kanye West justified Tense=Past|VerbForm=Fin
past_tense Michael Moore tweeted Tense=Past|VerbForm=Fin
past_tense WikiLeaks Central began Tense=Past|VerbForm=Fin
past_tense Alexa O'Brien modeled Tense=Past|VerbForm=Fin
past_tense Richard Lambert suggested Tense=Past|VerbForm=Fin
past_tense Shannon Bond found Tense=Past|VerbForm=Fin
past_tense Washington Post reported Tense=Past|VerbForm=Fin
past_tense Occupy Nigeria began Tense=Past|VerbForm=Fin
past_tense January Jonathan responded Tense=Past|VerbForm=Fin
past_tense Hurricane Sandy hit Tense=Past|VerbForm=Fin
past_tense Bernie Sanders protested Tense=Past|VerbForm=Fin
past_tense Occupy Movement organized Tense=Past|VerbForm=Fin
past_tense Occupy Kalamazoo began Tense=Past|VerbForm=Fin
past_tense Occupy Sydney had Tense=Past|VerbForm=Fin
past_tense Pirate Party participated Tense=Past|VerbForm=Fin
past_tense United Nations controlled Aspect=Perf|Tense=Past|VerbForm=Part
past_tense Occupy Berlin established Tense=Past|VerbForm=Fin
past_tense High Court ruled Tense=Past|VerbForm=Fin
past_tense Irish Times described Tense=Past|VerbForm=Fin
past_tense Occupy Seoul contained Tense=Past|VerbForm=Fin
past_tense South Korea overcame Tense=Past|VerbForm=Fin
past_tense M Movement drew Tense=Past|VerbForm=Fin
past_tense Lancaster Police arrested Tense=Past|VerbForm=Fin
past_tense Occupy Belfast initiated Tense=Past|VerbForm=Fin
past_tense Occupy Belfast took Tense=Past|VerbForm=Fin
past_tense Occupy Coleraine took Tense=Past|VerbForm=Fin
past_tense Occupy Glasgow set Aspect=Perf|Tense=Past|VerbForm=Part
past_tense Occupy Cardiff set Tense=Past|VerbForm=Fin
past_tense Francis Fukuyama argued Tense=Past|VerbForm=Fin
past_tense American Progress suggested Tense=Past|VerbForm=Fin
past_tense Richard Branson said Tense=Past|VerbForm=Fin
past_tense Jesse Jackson said Tense=Past|VerbForm=Fin
past_tense Daily Telegraph reported Tense=Past|VerbForm=Fin
past_tense Financial Times argued Tense=Past|VerbForm=Fin
past_tense Paul Mason said Tense=Past|VerbForm=Fin
past_tense Atlantic Magazine declared Tense=Past|VerbForm=Fin
past_tense England stated Tense=Past|VerbForm=Fin
past_tense California occupied Tense=Past|VerbForm=Fin
past_tense Spain marked Tense=Past|VerbForm=Fin
past_tense Anonymous encouraged Tense=Past|VerbForm=Fin
past_tense U.S. saw Tense=Past|VerbForm=Fin
past_tense Wolf argued Tense=Past|VerbForm=Fin
past_tense Indymedia helped Tense=Past|VerbForm=Fin
past_tense Occupy related Aspect=Perf|Tense=Past|VerbForm=Part
past_tense WikiLeaks endorsed Tense=Past|VerbForm=Fin
past_tense October included Tense=Past|VerbForm=Fin
past_tense Gapper said Tense=Past|VerbForm=Fin
past_tense Occupy protested Tense=Past|VerbForm=Fin
past_tense Feds ordered Tense=Past|VerbForm=Fin
past_tense HSBC filed Tense=Past|VerbForm=Fin
past_tense Rome masked Tense=Past|VerbForm=Fin
past_tense Occupy began Tense=Past|VerbForm=Fin
past_tense Norway began Tense=Past|VerbForm=Fin
past_tense NEET troubled Tense=Past|VerbForm=Fin
past_tense June included Tense=Past|VerbForm=Fin
past_tense Vancouver added Tense=Past|VerbForm=Fin
past_tense Conan launched Tense=Past|VerbForm=Fin
past_tense Occupy influenced Tense=Past|VerbForm=Fin
past_tense FBI offered Tense=Past|VerbForm=Fin
past_tense FBI used Tense=Past|VerbForm=Fin
past_tense FBI withheld Tense=Past|VerbForm=Fin
past_tense FBI refused Tense=Past|VerbForm=Fin
past_tense Shapiro filed Tense=Past|VerbForm=Fin
As you can see, the past_tense pattern can match objects based on a single morphological feature, although most matches share another morphological feature, namely the finite form.
Matching syntactic dependencies
If you want to match patterns based on syntactic dependencies, you must use the DependencyMatcher class in spaCy.
As we learned in Part II, syntactic dependencies describe the relations that hold between Token objects.
To get started, let’s import the DependencyMatcher class from the matcher submodule.
As you can see, the DependencyMatcher is initialised just like the Matcher above.
# Import the DependencyMatcher class
from spacy.matcher import DependencyMatcher
# Create a DependencyMatcher and provide model vocabulary;
# assign result under the variable 'dep_matcher'
dep_matcher = DependencyMatcher(vocab=nlp.vocab)
This provides us with a DependencyMatcher stored under the variable dep_matcher, which is now ready for storing dependency patterns.
When developing pattern rules for matching syntactic dependencies, the first step is to determine an “anchor” around which the pattern is built.
Visualising the syntactic dependencies, as instructed in Part II, can help formulate patterns.
Let’s import the displacy submodule to draw the syntactic dependencies for a sentence in the Doc object stored under the variable doc.
# Import the displacy submodule from spaCy
from spacy import displacy
# Cast the sentences contained in the Doc object into a list; take the sentence
# at index 420. Set the 'style' argument to 'dep' to draw syntactic dependencies.
displacy.render(list(doc.sents)[420], style='dep')
As introduced in Part II, syntactic dependencies are visualised using arcs that lead from the head Token to the dependent Token. The label of the arc gives the syntactic dependency.
Let’s define a pattern that searches for verbs and their nominal subjects (nsubj).
Just as with the Matcher class, the pattern rules for the DependencyMatcher are defined using a list of dictionaries.
The first dictionary in the list defines an “anchor” pattern and its attributes.
You can think of the pattern rule as a chain that proceeds from left to right, and the first pattern on the left acts as an anchor for the subsequent patterns on its right-hand side.
Hence we define the following pattern for the anchor:
{'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}}
We use the required key RIGHT_ID to provide a name for this pattern, which can be then used to refer to this pattern by subsequent patterns on its right-hand side.
In other words, when you see the key RIGHT_ID, think of a name for the current pattern.
We then create a dictionary under the key RIGHT_ATTRS that holds the linguistic features of the anchor. In this case, we determine that the anchor should have VERB as its part-of-speech tag.
Next, we determine a pattern for the next “link” in the chain to the right of the anchor:
{'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'subject', 'RIGHT_ATTRS': {'DEP': 'nsubj'}}
We start by providing the key LEFT_ID, whose value is a string that refers to the name of a pattern on the left-hand side of the current pattern. This is the name that we gave to the anchor using the key RIGHT_ID.
Next, we use the key REL_OP to define a relation operator, which determines the relationship between this pattern and the one referred to by LEFT_ID.
The relation operator > defines that the pattern under LEFT_ID (the anchor) should be the immediate head of the current pattern.
Next, we name the current pattern using the key RIGHT_ID, which enables subsequent patterns on the right-hand side to refer to it, if necessary. We give this pattern the name subject.
We then use the key RIGHT_ATTRS to define the attributes of the current pattern. We specify that the syntactic relation that holds between this pattern and the one on the left should be nsubj, or nominal subject.
# Define a list with nested dictionaries that contains the pattern to be matched
dep_pattern = [
    {'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}},
    {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'subject', 'RIGHT_ATTRS': {'DEP': 'nsubj'}}
]
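To make the role of the relation operator > more concrete, the following sketch mimics in plain Python what the two dictionaries ask for: a Token tagged as VERB that is the immediate head of a Token with the dependency label nsubj. The toy parse is written by hand and the function find_head_dependent_pairs is a hypothetical helper. This is only an illustration of the idea, not how spaCy implements matching.

```python
# A hand-annotated toy parse (an assumption for illustration, not model output).
# Each Token records its text, part-of-speech tag, dependency label and the
# index of its head; the root points to itself, as in spaCy.
tokens = [
    {'text': 'Police', 'pos': 'NOUN', 'dep': 'nsubj', 'head': 1},
    {'text': 'cleared', 'pos': 'VERB', 'dep': 'ROOT', 'head': 1},
    {'text': 'the', 'pos': 'DET', 'dep': 'det', 'head': 3},
    {'text': 'camp', 'pos': 'NOUN', 'dep': 'dobj', 'head': 1},
]


def find_head_dependent_pairs(tokens, head_pos, dependent_dep):
    """Return [head_index, dependent_index] pairs in which the head has the
    given part-of-speech tag and the dependent has the given dependency label."""
    pairs = []
    for i, token in enumerate(tokens):
        head = token['head']
        # Skip the root, which is its own head
        if head == i:
            continue
        # The head must satisfy the anchor; the dependent must satisfy the link
        if tokens[head]['pos'] == head_pos and token['dep'] == dependent_dep:
            pairs.append([head, i])
    return pairs


# Look for verbs (the anchor) that are heads of nominal subjects
pairs = find_head_dependent_pairs(tokens, head_pos='VERB', dependent_dep='nsubj')
print(pairs)
```

Each pair lists the index of the head first and the index of the dependent second, mirroring the order in which the anchor and the subsequent pattern are defined.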
Having compiled these two dictionaries into a list, we add the pattern to the DependencyMatcher under dep_matcher and search the Doc object doc for matches.
We store the resulting matches under the variable dep_matches and call this variable to examine the output.
# Add the pattern to the matcher under the name 'nsubj_verb'
dep_matcher.add('nsubj_verb', patterns=[dep_pattern])
# Apply the DependencyMatcher to the Doc object under 'doc'; store the result
# under the variable 'dep_matches'.
dep_matches = dep_matcher(doc)
# Call the variable to examine the output
dep_matches
[(5549296207297668001, [15, 14]),
(5549296207297668001, [37, 36]),
(5549296207297668001, [53, 52]),
(5549296207297668001, [62, 60]),
(5549296207297668001, [70, 69]),
(5549296207297668001, [81, 73]),
(5549296207297668001, [89, 87]),
(5549296207297668001, [100, 99]),
(5549296207297668001, [106, 105]),
(5549296207297668001, [133, 122]),
(5549296207297668001, [146, 144]),
(5549296207297668001, [172, 171]),
(5549296207297668001, [188, 184]),
(5549296207297668001, [207, 206]),
(5549296207297668001, [212, 211]),
(5549296207297668001, [223, 221]),
(5549296207297668001, [237, 235]),
(5549296207297668001, [311, 309]),
(5549296207297668001, [329, 328]),
(5549296207297668001, [349, 348]),
(5549296207297668001, [413, 404]),
(5549296207297668001, [449, 443]),
(5549296207297668001, [466, 464]),
(5549296207297668001, [501, 492]),
(5549296207297668001, [507, 506]),
(5549296207297668001, [544, 543]),
(5549296207297668001, [588, 587]),
(5549296207297668001, [596, 585]),
(5549296207297668001, [613, 612]),
(5549296207297668001, [633, 632]),
(5549296207297668001, [681, 678]),
(5549296207297668001, [735, 713]),
(5549296207297668001, [769, 754]),
(5549296207297668001, [809, 808]),
(5549296207297668001, [835, 832]),
(5549296207297668001, [866, 862]),
(5549296207297668001, [881, 880]),
(5549296207297668001, [895, 894]),
(5549296207297668001, [904, 902]),
(5549296207297668001, [937, 936]),
(5549296207297668001, [1048, 1047]),
(5549296207297668001, [1077, 1072]),
(5549296207297668001, [1138, 1130]),
(5549296207297668001, [1164, 1154]),
(5549296207297668001, [1188, 1179]),
(5549296207297668001, [1200, 1194]),
(5549296207297668001, [1209, 1208]),
(5549296207297668001, [1225, 1221]),
(5549296207297668001, [1228, 1227]),
(5549296207297668001, [1275, 1270]),
(5549296207297668001, [1290, 1289]),
(5549296207297668001, [1302, 1299]),
(5549296207297668001, [1369, 1366]),
(5549296207297668001, [1386, 1385]),
(5549296207297668001, [1404, 1391]),
(5549296207297668001, [1422, 1413]),
(5549296207297668001, [1434, 1432]),
(5549296207297668001, [1486, 1480]),
(5549296207297668001, [1508, 1503]),
(5549296207297668001, [1517, 1514]),
(5549296207297668001, [1542, 1541]),
(5549296207297668001, [1599, 1596]),
(5549296207297668001, [1618, 1617]),
(5549296207297668001, [1622, 1620]),
(5549296207297668001, [1631, 1628]),
(5549296207297668001, [1667, 1666]),
(5549296207297668001, [1689, 1688]),
(5549296207297668001, [1693, 1691]),
(5549296207297668001, [1703, 1702]),
(5549296207297668001, [1717, 1716]),
(5549296207297668001, [1774, 1773]),
(5549296207297668001, [1819, 1811]),
(5549296207297668001, [1828, 1824]),
(5549296207297668001, [1832, 1831]),
(5549296207297668001, [1854, 1853]),
(5549296207297668001, [1869, 1864]),
(5549296207297668001, [1879, 1878]),
(5549296207297668001, [1906, 1905]),
(5549296207297668001, [1935, 1923]),
(5549296207297668001, [1939, 1937]),
(5549296207297668001, [1948, 1947]),
(5549296207297668001, [1981, 1979]),
(5549296207297668001, [2001, 2000]),
(5549296207297668001, [2051, 2045]),
(5549296207297668001, [2081, 2080]),
(5549296207297668001, [2152, 2151]),
(5549296207297668001, [2163, 2162]),
(5549296207297668001, [2189, 2188]),
(5549296207297668001, [2199, 2197]),
(5549296207297668001, [2216, 2215]),
(5549296207297668001, [2224, 2223]),
(5549296207297668001, [2231, 2230]),
(5549296207297668001, [2268, 2262]),
(5549296207297668001, [2320, 2318]),
(5549296207297668001, [2387, 2386]),
(5549296207297668001, [2404, 2402]),
(5549296207297668001, [2414, 2413]),
(5549296207297668001, [2463, 2462]),
(5549296207297668001, [2471, 2468]),
(5549296207297668001, [2486, 2483]),
(5549296207297668001, [2491, 2489]),
(5549296207297668001, [2519, 2518]),
(5549296207297668001, [2596, 2595]),
(5549296207297668001, [2616, 2614]),
(5549296207297668001, [2625, 2623]),
(5549296207297668001, [2637, 2636]),
(5549296207297668001, [2647, 2644]),
(5549296207297668001, [2656, 2653]),
(5549296207297668001, [2660, 2659]),
(5549296207297668001, [2663, 2661]),
(5549296207297668001, [2686, 2684]),
(5549296207297668001, [2698, 2697]),
(5549296207297668001, [2723, 2722]),
(5549296207297668001, [2804, 2802]),
(5549296207297668001, [2806, 2805]),
(5549296207297668001, [2812, 2810]),
(5549296207297668001, [2830, 2825]),
(5549296207297668001, [2846, 2844]),
(5549296207297668001, [2870, 2869]),
(5549296207297668001, [2880, 2874]),
(5549296207297668001, [2885, 2882]),
(5549296207297668001, [2931, 2930]),
(5549296207297668001, [2947, 2946]),
(5549296207297668001, [2981, 2980]),
(5549296207297668001, [2987, 2986]),
(5549296207297668001, [2995, 2994]),
(5549296207297668001, [3004, 3000]),
(5549296207297668001, [3025, 3024]),
(5549296207297668001, [3028, 3027]),
(5549296207297668001, [3053, 3052]),
(5549296207297668001, [3068, 3065]),
(5549296207297668001, [3071, 3070]),
(5549296207297668001, [3076, 3075]),
(5549296207297668001, [3087, 3086]),
(5549296207297668001, [3100, 3097]),
(5549296207297668001, [3102, 3096]),
(5549296207297668001, [3119, 3118]),
(5549296207297668001, [3134, 3131]),
(5549296207297668001, [3141, 3138]),
(5549296207297668001, [3151, 3149]),
(5549296207297668001, [3161, 3160]),
(5549296207297668001, [3188, 3186]),
(5549296207297668001, [3196, 3195]),
(5549296207297668001, [3230, 3229]),
(5549296207297668001, [3235, 3234]),
(5549296207297668001, [3248, 3247]),
(5549296207297668001, [3279, 3278]),
(5549296207297668001, [3310, 3309]),
(5549296207297668001, [3316, 3315]),
(5549296207297668001, [3328, 3327]),
(5549296207297668001, [3353, 3341]),
(5549296207297668001, [3353, 3352]),
(5549296207297668001, [3361, 3360]),
(5549296207297668001, [3368, 3364]),
(5549296207297668001, [3392, 3391]),
(5549296207297668001, [3424, 3423]),
(5549296207297668001, [3450, 3449]),
(5549296207297668001, [3455, 3454]),
(5549296207297668001, [3489, 3475]),
(5549296207297668001, [3567, 3566]),
(5549296207297668001, [3588, 3581]),
(5549296207297668001, [3605, 3599]),
(5549296207297668001, [3631, 3630]),
(5549296207297668001, [3647, 3646]),
(5549296207297668001, [3682, 3681]),
(5549296207297668001, [3718, 3716]),
(5549296207297668001, [3724, 3723]),
(5549296207297668001, [3737, 3736]),
(5549296207297668001, [3794, 3788]),
(5549296207297668001, [3829, 3820]),
(5549296207297668001, [3839, 3837]),
(5549296207297668001, [3845, 3844]),
(5549296207297668001, [3854, 3852]),
(5549296207297668001, [3881, 3880]),
(5549296207297668001, [3894, 3893]),
(5549296207297668001, [3904, 3903]),
(5549296207297668001, [3908, 3906]),
(5549296207297668001, [3935, 3926]),
(5549296207297668001, [3940, 3939]),
(5549296207297668001, [3968, 3957]),
(5549296207297668001, [3991, 3990]),
(5549296207297668001, [4007, 3998]),
(5549296207297668001, [4018, 4016]),
(5549296207297668001, [4040, 4039]),
(5549296207297668001, [4046, 4043]),
(5549296207297668001, [4061, 4060]),
(5549296207297668001, [4064, 4063]),
(5549296207297668001, [4077, 4076]),
(5549296207297668001, [4095, 4086]),
(5549296207297668001, [4104, 4100]),
(5549296207297668001, [4129, 4128]),
(5549296207297668001, [4139, 4138]),
(5549296207297668001, [4145, 4144]),
(5549296207297668001, [4160, 4156]),
(5549296207297668001, [4162, 4161]),
(5549296207297668001, [4180, 4179]),
(5549296207297668001, [4185, 4184]),
(5549296207297668001, [4200, 4199]),
(5549296207297668001, [4221, 4220]),
(5549296207297668001, [4235, 4234]),
(5549296207297668001, [4252, 4245]),
(5549296207297668001, [4279, 4277]),
(5549296207297668001, [4291, 4290]),
(5549296207297668001, [4294, 4293]),
(5549296207297668001, [4309, 4305]),
(5549296207297668001, [4337, 4334]),
(5549296207297668001, [4377, 4376]),
(5549296207297668001, [4393, 4392]),
(5549296207297668001, [4400, 4397]),
(5549296207297668001, [4427, 4426]),
(5549296207297668001, [4429, 4423]),
(5549296207297668001, [4438, 4432]),
(5549296207297668001, [4541, 4540]),
(5549296207297668001, [4570, 4569]),
(5549296207297668001, [4583, 4579]),
(5549296207297668001, [4605, 4604]),
(5549296207297668001, [4631, 4621]),
(5549296207297668001, [4677, 4671]),
(5549296207297668001, [4693, 4692]),
(5549296207297668001, [4706, 4704]),
(5549296207297668001, [4881, 4880]),
(5549296207297668001, [4920, 4919]),
(5549296207297668001, [4939, 4937]),
(5549296207297668001, [5020, 5019]),
(5549296207297668001, [5037, 5036]),
(5549296207297668001, [5060, 5058]),
(5549296207297668001, [5074, 5065]),
(5549296207297668001, [5080, 5079]),
(5549296207297668001, [5097, 5095]),
(5549296207297668001, [5130, 5129]),
(5549296207297668001, [5139, 5138]),
(5549296207297668001, [5158, 5156]),
(5549296207297668001, [5176, 5174]),
(5549296207297668001, [5188, 5186]),
(5549296207297668001, [5204, 5202]),
(5549296207297668001, [5265, 5264]),
(5549296207297668001, [5270, 5267]),
(5549296207297668001, [5302, 5301]),
(5549296207297668001, [5337, 5331]),
(5549296207297668001, [5363, 5362]),
(5549296207297668001, [5373, 5372]),
(5549296207297668001, [5382, 5379]),
(5549296207297668001, [5387, 5375]),
(5549296207297668001, [5399, 5398]),
(5549296207297668001, [5431, 5427]),
(5549296207297668001, [5446, 5443]),
(5549296207297668001, [5455, 5454]),
(5549296207297668001, [5460, 5458]),
(5549296207297668001, [5485, 5483]),
(5549296207297668001, [5506, 5500]),
(5549296207297668001, [5520, 5519]),
(5549296207297668001, [5525, 5524]),
(5549296207297668001, [5540, 5538]),
(5549296207297668001, [5574, 5558]),
(5549296207297668001, [5589, 5588]),
(5549296207297668001, [5605, 5603]),
(5549296207297668001, [5626, 5625]),
(5549296207297668001, [5635, 5634]),
(5549296207297668001, [5656, 5652]),
(5549296207297668001, [5687, 5682]),
(5549296207297668001, [5718, 5714]),
(5549296207297668001, [5731, 5730]),
(5549296207297668001, [5745, 5744]),
(5549296207297668001, [5776, 5775]),
(5549296207297668001, [5788, 5787]),
(5549296207297668001, [5798, 5797]),
(5549296207297668001, [5821, 5820]),
(5549296207297668001, [5846, 5841]),
(5549296207297668001, [5875, 5874]),
(5549296207297668001, [5916, 5910]),
(5549296207297668001, [5961, 5960]),
(5549296207297668001, [5966, 5965]),
(5549296207297668001, [5995, 5994]),
(5549296207297668001, [6016, 6015]),
(5549296207297668001, [6036, 6034]),
(5549296207297668001, [6061, 6060]),
(5549296207297668001, [6066, 6064]),
(5549296207297668001, [6092, 6091]),
(5549296207297668001, [6112, 6108]),
(5549296207297668001, [6124, 6122]),
(5549296207297668001, [6137, 6136]),
(5549296207297668001, [6142, 6140]),
(5549296207297668001, [6153, 6152]),
(5549296207297668001, [6162, 6161]),
(5549296207297668001, [6184, 6183]),
(5549296207297668001, [6200, 6198]),
(5549296207297668001, [6221, 6218]),
(5549296207297668001, [6230, 6228]),
(5549296207297668001, [6249, 6193]),
(5549296207297668001, [6279, 6273]),
(5549296207297668001, [6292, 6290]),
(5549296207297668001, [6314, 6312]),
(5549296207297668001, [6323, 6321]),
(5549296207297668001, [6330, 6327]),
(5549296207297668001, [6359, 6358]),
(5549296207297668001, [6382, 6381]),
(5549296207297668001, [6400, 6392]),
(5549296207297668001, [6450, 6449]),
(5549296207297668001, [6491, 6488]),
(5549296207297668001, [6511, 6510]),
(5549296207297668001, [6521, 6519]),
(5549296207297668001, [6533, 6528]),
(5549296207297668001, [6548, 6547]),
(5549296207297668001, [6552, 6550]),
(5549296207297668001, [6560, 6559]),
(5549296207297668001, [6571, 6570]),
(5549296207297668001, [6589, 6588]),
(5549296207297668001, [6617, 6616]),
(5549296207297668001, [6632, 6630]),
(5549296207297668001, [6641, 6640]),
(5549296207297668001, [6706, 6705]),
(5549296207297668001, [6747, 6744]),
(5549296207297668001, [6762, 6754]),
(5549296207297668001, [6783, 6782]),
(5549296207297668001, [6822, 6821]),
(5549296207297668001, [6851, 6850]),
(5549296207297668001, [6872, 6871]),
(5549296207297668001, [6905, 6904]),
(5549296207297668001, [6926, 6925]),
(5549296207297668001, [6967, 6966]),
(5549296207297668001, [6987, 6986]),
(5549296207297668001, [7034, 7033]),
(5549296207297668001, [7048, 7047]),
(5549296207297668001, [7077, 7074]),
(5549296207297668001, [7084, 7083]),
(5549296207297668001, [7099, 7095]),
(5549296207297668001, [7115, 7114]),
(5549296207297668001, [7202, 7201]),
(5549296207297668001, [7250, 7248]),
(5549296207297668001, [7269, 7268]),
(5549296207297668001, [7280, 7279]),
(5549296207297668001, [7293, 7292]),
(5549296207297668001, [7302, 7301]),
(5549296207297668001, [7311, 7310]),
(5549296207297668001, [7339, 7338]),
(5549296207297668001, [7363, 7362]),
(5549296207297668001, [7412, 7408]),
(5549296207297668001, [7511, 7510]),
(5549296207297668001, [7557, 7556]),
(5549296207297668001, [7563, 7562]),
(5549296207297668001, [7577, 7576]),
(5549296207297668001, [7625, 7624]),
(5549296207297668001, [7654, 7653]),
(5549296207297668001, [7698, 7688]),
(5549296207297668001, [7734, 7732]),
(5549296207297668001, [7758, 7757]),
(5549296207297668001, [7797, 7792]),
(5549296207297668001, [7802, 7801]),
(5549296207297668001, [7846, 7845]),
(5549296207297668001, [7854, 7851]),
(5549296207297668001, [7880, 7876]),
(5549296207297668001, [7910, 7908]),
(5549296207297668001, [7932, 7931]),
(5549296207297668001, [7961, 7958]),
(5549296207297668001, [7974, 7973]),
(5549296207297668001, [7990, 7989]),
(5549296207297668001, [7998, 7997]),
(5549296207297668001, [8004, 8002]),
(5549296207297668001, [8021, 8020]),
(5549296207297668001, [8032, 8028]),
(5549296207297668001, [8098, 8097]),
(5549296207297668001, [8113, 8112]),
(5549296207297668001, [8118, 8116]),
(5549296207297668001, [8161, 8160]),
(5549296207297668001, [8192, 8191]),
(5549296207297668001, [8271, 8270]),
(5549296207297668001, [8355, 8354]),
(5549296207297668001, [8358, 8357]),
(5549296207297668001, [8383, 8382]),
(5549296207297668001, [8423, 8421]),
(5549296207297668001, [8477, 8476]),
(5549296207297668001, [8498, 8497]),
(5549296207297668001, [8513, 8512]),
(5549296207297668001, [8575, 8574]),
(5549296207297668001, [8591, 8590]),
(5549296207297668001, [8610, 8609]),
(5549296207297668001, [8624, 8621]),
(5549296207297668001, [8630, 8629]),
(5549296207297668001, [8657, 8645]),
(5549296207297668001, [8692, 8691]),
(5549296207297668001, [8694, 8680]),
(5549296207297668001, [8732, 8731]),
(5549296207297668001, [8756, 8740]),
(5549296207297668001, [8760, 8759]),
(5549296207297668001, [8778, 8777]),
(5549296207297668001, [8782, 8780]),
(5549296207297668001, [8853, 8851]),
(5549296207297668001, [8872, 8871]),
(5549296207297668001, [8890, 8889]),
(5549296207297668001, [8917, 8902]),
(5549296207297668001, [8927, 8926]),
(5549296207297668001, [8942, 8941]),
(5549296207297668001, [8965, 8964]),
(5549296207297668001, [8989, 8988]),
(5549296207297668001, [9024, 9023]),
(5549296207297668001, [9048, 9047]),
(5549296207297668001, [9071, 9070]),
(5549296207297668001, [9085, 9084]),
(5549296207297668001, [9125, 9124]),
(5549296207297668001, [9175, 9174]),
(5549296207297668001, [9191, 9190]),
(5549296207297668001, [9222, 9220]),
(5549296207297668001, [9256, 9253]),
(5549296207297668001, [9283, 9279]),
(5549296207297668001, [9357, 9356]),
(5549296207297668001, [9368, 9367]),
(5549296207297668001, [9423, 9421]),
(5549296207297668001, [9476, 9475]),
(5549296207297668001, [9489, 9488]),
(5549296207297668001, [9572, 9569]),
(5549296207297668001, [9597, 9595]),
(5549296207297668001, [9639, 9638]),
(5549296207297668001, [9675, 9673]),
(5549296207297668001, [9700, 9697]),
(5549296207297668001, [9726, 9725]),
(5549296207297668001, [9774, 9773]),
(5549296207297668001, [9805, 9800]),
(5549296207297668001, [9810, 9809]),
(5549296207297668001, [9845, 9842]),
(5549296207297668001, [9857, 9854]),
(5549296207297668001, [9878, 9876]),
(5549296207297668001, [9928, 9927]),
(5549296207297668001, [9966, 9965]),
(5549296207297668001, [9986, 9984]),
(5549296207297668001, [10049, 10048]),
(5549296207297668001, [10059, 10058]),
(5549296207297668001, [10075, 10071]),
(5549296207297668001, [10079, 10077]),
(5549296207297668001, [10127, 10088]),
(5549296207297668001, [10127, 10126]),
(5549296207297668001, [10135, 10134]),
(5549296207297668001, [10160, 10155]),
(5549296207297668001, [10224, 10210]),
(5549296207297668001, [10255, 10254]),
(5549296207297668001, [10313, 10312]),
(5549296207297668001, [10343, 10342]),
(5549296207297668001, [10367, 10357]),
(5549296207297668001, [10375, 10372]),
(5549296207297668001, [10395, 10394]),
(5549296207297668001, [10447, 10446]),
(5549296207297668001, [10479, 10477]),
(5549296207297668001, [10529, 10528]),
(5549296207297668001, [10537, 10535]),
(5549296207297668001, [10551, 10550]),
(5549296207297668001, [10641, 10638]),
(5549296207297668001, [10704, 10703]),
(5549296207297668001, [10709, 10708]),
(5549296207297668001, [10758, 10757]),
(5549296207297668001, [10763, 10762]),
(5549296207297668001, [10774, 10773]),
(5549296207297668001, [10786, 10785]),
(5549296207297668001, [10807, 10804]),
(5549296207297668001, [10822, 10821]),
(5549296207297668001, [10838, 10837]),
(5549296207297668001, [10852, 10850]),
(5549296207297668001, [10895, 10893]),
(5549296207297668001, [10905, 10904]),
(5549296207297668001, [10922, 10921]),
(5549296207297668001, [10989, 10986]),
(5549296207297668001, [11002, 10998]),
(5549296207297668001, [11016, 11015]),
(5549296207297668001, [11028, 11027]),
(5549296207297668001, [11046, 11044]),
(5549296207297668001, [11101, 11100]),
(5549296207297668001, [11136, 11135]),
(5549296207297668001, [11172, 11171]),
(5549296207297668001, [11247, 11246]),
(5549296207297668001, [11281, 11274]),
(5549296207297668001, [11287, 11286]),
(5549296207297668001, [11308, 11307]),
(5549296207297668001, [11343, 11342]),
(5549296207297668001, [11359, 11358]),
(5549296207297668001, [11363, 11361]),
(5549296207297668001, [11379, 11378]),
(5549296207297668001, [11391, 11390]),
(5549296207297668001, [11400, 11399]),
(5549296207297668001, [11420, 11419]),
(5549296207297668001, [11464, 11454]),
(5549296207297668001, [11474, 11473]),
(5549296207297668001, [11497, 11491]),
(5549296207297668001, [11513, 11512]),
(5549296207297668001, [11526, 11524]),
(5549296207297668001, [11541, 11540]),
(5549296207297668001, [11549, 11548]),
(5549296207297668001, [11554, 11552]),
(5549296207297668001, [11573, 11572]),
(5549296207297668001, [11577, 11576]),
(5549296207297668001, [11614, 11613]),
(5549296207297668001, [11629, 11623]),
(5549296207297668001, [11671, 11657]),
(5549296207297668001, [11699, 11690]),
(5549296207297668001, [11720, 11719]),
(5549296207297668001, [11721, 11718]),
(5549296207297668001, [11739, 11737]),
(5549296207297668001, [11755, 11754]),
(5549296207297668001, [11780, 11779]),
(5549296207297668001, [11804, 11803]),
(5549296207297668001, [11808, 11807]),
(5549296207297668001, [11818, 11816]),
(5549296207297668001, [11830, 11829]),
(5549296207297668001, [11846, 11845]),
(5549296207297668001, [11898, 11896]),
(5549296207297668001, [11903, 11901]),
(5549296207297668001, [11920, 11919]),
(5549296207297668001, [11948, 11947]),
(5549296207297668001, [11962, 11960]),
(5549296207297668001, [11967, 11965]),
(5549296207297668001, [11991, 11990]),
(5549296207297668001, [12006, 12005]),
(5549296207297668001, [12013, 12011]),
(5549296207297668001, [12043, 12040]),
(5549296207297668001, [12050, 12049]),
(5549296207297668001, [12077, 12076]),
(5549296207297668001, [12109, 12108]),
(5549296207297668001, [12126, 12117]),
(5549296207297668001, [12146, 12145]),
(5549296207297668001, [12224, 12223]),
(5549296207297668001, [12243, 12245]),
(5549296207297668001, [12254, 12249]),
(5549296207297668001, [12280, 12274]),
(5549296207297668001, [12319, 12318]),
(5549296207297668001, [12351, 12335]),
(5549296207297668001, [12383, 12373]),
(5549296207297668001, [12436, 12431]),
(5549296207297668001, [12456, 12454]),
(5549296207297668001, [12473, 12472]),
(5549296207297668001, [12511, 12510]),
(5549296207297668001, [12521, 12519]),
(5549296207297668001, [12550, 12541]),
(5549296207297668001, [12556, 12554]),
(5549296207297668001, [12578, 12573]),
(5549296207297668001, [12598, 12586]),
(5549296207297668001, [12607, 12603]),
(5549296207297668001, [12654, 12649]),
(5549296207297668001, [12664, 12663]),
(5549296207297668001, [12685, 12684]),
(5549296207297668001, [12699, 12696]),
(5549296207297668001, [12710, 12709]),
(5549296207297668001, [12720, 12717]),
(5549296207297668001, [12731, 12730]),
(5549296207297668001, [12759, 12756]),
(5549296207297668001, [12790, 12779]),
(5549296207297668001, [12810, 12809]),
(5549296207297668001, [12821, 12820]),
(5549296207297668001, [12828, 12824]),
(5549296207297668001, [12877, 12859]),
(5549296207297668001, [12887, 12884]),
(5549296207297668001, [12899, 12897]),
(5549296207297668001, [12917, 12916]),
(5549296207297668001, [12933, 12932]),
(5549296207297668001, [12955, 12954]),
(5549296207297668001, [12976, 12972]),
(5549296207297668001, [13003, 13002]),
(5549296207297668001, [13027, 13025]),
(5549296207297668001, [13031, 13030]),
(5549296207297668001, [13036, 13035]),
(5549296207297668001, [13050, 13048]),
(5549296207297668001, [13056, 13055]),
(5549296207297668001, [13084, 13081]),
(5549296207297668001, [13116, 13098]),
(5549296207297668001, [13127, 13126]),
(5549296207297668001, [13130, 13129]),
(5549296207297668001, [13160, 13159]),
(5549296207297668001, [13184, 13183]),
(5549296207297668001, [13190, 13187]),
(5549296207297668001, [13196, 13192]),
(5549296207297668001, [13217, 13216]),
(5549296207297668001, [13232, 13231]),
(5549296207297668001, [13235, 13233]),
(5549296207297668001, [13251, 13250]),
(5549296207297668001, [13304, 13298]),
(5549296207297668001, [13330, 13324]),
(5549296207297668001, [13355, 13353]),
(5549296207297668001, [13370, 13367]),
(5549296207297668001, [13396, 13395]),
(5549296207297668001, [13414, 13413]),
(5549296207297668001, [13419, 13417]),
(5549296207297668001, [13444, 13438]),
(5549296207297668001, [13453, 13451]),
(5549296207297668001, [13501, 13499]),
(5549296207297668001, [13505, 13504]),
(5549296207297668001, [13509, 13507]),
(5549296207297668001, [13521, 13518]),
(5549296207297668001, [13545, 13536]),
(5549296207297668001, [13569, 13567]),
(5549296207297668001, [13608, 13607]),
(5549296207297668001, [13629, 13626]),
(5549296207297668001, [13650, 13649]),
(5549296207297668001, [13656, 13654]),
(5549296207297668001, [13689, 13687]),
(5549296207297668001, [13721, 13703]),
(5549296207297668001, [13724, 13723]),
(5549296207297668001, [13739, 13738]),
(5549296207297668001, [13770, 13768]),
(5549296207297668001, [13785, 13782]),
(5549296207297668001, [13794, 13793]),
(5549296207297668001, [13795, 13792]),
(5549296207297668001, [13804, 13802]),
(5549296207297668001, [13823, 13821]),
(5549296207297668001, [13835, 13834]),
(5549296207297668001, [13887, 13886]),
(5549296207297668001, [13918, 13890]),
(5549296207297668001, [13947, 13946]),
(5549296207297668001, [13968, 13967]),
(5549296207297668001, [13988, 13987]),
(5549296207297668001, [14010, 14008]),
(5549296207297668001, [14029, 14028]),
(5549296207297668001, [14051, 14049]),
(5549296207297668001, [14073, 14071]),
(5549296207297668001, [14120, 14119]),
(5549296207297668001, [14173, 14172]),
(5549296207297668001, [14219, 14204]),
(5549296207297668001, [14273, 14272]),
(5549296207297668001, [14278, 14277]),
(5549296207297668001, [14321, 14320]),
(5549296207297668001, [14337, 14334]),
(5549296207297668001, [14371, 14370]),
(5549296207297668001, [14382, 14380]),
(5549296207297668001, [14393, 14392]),
(5549296207297668001, [14422, 14418]),
(5549296207297668001, [14431, 14430]),
(5549296207297668001, [14434, 14433]),
(5549296207297668001, [14458, 14453]),
(5549296207297668001, [14499, 14492]),
(5549296207297668001, [14501, 14500]),
(5549296207297668001, [14551, 14550]),
(5549296207297668001, [14561, 14554]),
(5549296207297668001, [14576, 14573]),
(5549296207297668001, [14597, 14594]),
(5549296207297668001, [14635, 14634]),
(5549296207297668001, [14670, 14669]),
(5549296207297668001, [14682, 14681]),
(5549296207297668001, [14711, 14710]),
(5549296207297668001, [14727, 14696]),
(5549296207297668001, [14744, 14740]),
(5549296207297668001, [14759, 14757]),
(5549296207297668001, [14773, 14772]),
(5549296207297668001, [14783, 14781]),
(5549296207297668001, [14826, 14816]),
(5549296207297668001, [14838, 14837]),
(5549296207297668001, [14844, 14842]),
(5549296207297668001, [14849, 14848]),
(5549296207297668001, [14859, 14856])]
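A list of this length is hard to verify by eye. As a minimal sketch, the same kind of pattern can be tried on a tiny Doc object that is built by hand, so that the expected match is easy to check. The sentence, its part-of-speech tags, heads and dependency labels are assumptions written for this example rather than output from a language model, and the variable names prefixed with toy_ are hypothetical.

```python
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a Doc by hand instead of running a language model. The annotations
# below are assumptions made for this sketch. The 'heads' argument gives the
# index of the head for each Token; the root points to itself.
vocab = Vocab()
toy_doc = Doc(vocab,
              words=['Protesters', 'in', 'Oakland', 'occupied', 'the', 'park'],
              pos=['NOUN', 'ADP', 'PROPN', 'VERB', 'DET', 'NOUN'],
              heads=[3, 0, 1, 3, 5, 3],
              deps=['nsubj', 'prep', 'pobj', 'ROOT', 'det', 'dobj'])

# Define a pattern for a verb and its nominal subject
toy_pattern = [
    {'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}},
    {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'subject', 'RIGHT_ATTRS': {'DEP': 'nsubj'}}
]

# Create a DependencyMatcher that shares its vocabulary with the toy Doc,
# add the pattern and apply the matcher
toy_matcher = DependencyMatcher(vocab)
toy_matcher.add('nsubj_verb', patterns=[toy_pattern])
toy_matches = toy_matcher(toy_doc)

# Each match holds an identifier for the pattern and a list of Token indices
for match_id, token_ids in toy_matches:
    verb, subject = token_ids
    print(vocab.strings[match_id], '\t', toy_doc[subject].text, '...', toy_doc[verb].text)
```

In this toy sentence the subject and the verb are separated by a prepositional phrase, which shows that the matched Tokens need not be adjacent.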
Unlike the Matcher, the DependencyMatcher cannot return the matches as Span objects, because the matches do not necessarily form the contiguous sequence of Tokens needed for a Span object.
Instead, the DependencyMatcher returns a list of tuples.
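Each tuple contains two items:
An integer that identifies the pattern; looking this value up in the Vocabulary gives a Lexeme object whose text is the name of the pattern
A list of indices for Tokens that match the search pattern in the Doc object, given in the order in which the patterns were defined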
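The point about contiguity can be illustrated with two tuples copied from the output above. The sketch below computes the smallest slice that would cover all Token indices in a match; the function covering_range is a hypothetical helper introduced only for this illustration.

```python
# Two tuples copied from the output above: an identifier for the pattern and
# a list of Token indices in the order [verb, subject]
sample_matches = [(5549296207297668001, [15, 14]),
                  (5549296207297668001, [81, 73])]


def covering_range(token_ids):
    """Return (start, end) for the smallest contiguous slice that contains
    all of the given Token indices; 'end' is exclusive, as in Python slices."""
    return min(token_ids), max(token_ids) + 1


for match_id, token_ids in sample_matches:
    start, end = covering_range(token_ids)
    # Count the Tokens inside the slice that are not part of the match itself
    extra = (end - start) - len(token_ids)
    print(token_ids, '->', (start, end), 'extra tokens:', extra)
```

Slicing the Doc object with such a range, as in doc[start:end], would give a Span that also contains every Token between the verb and its subject, which is why the matcher reports indices of individual Tokens instead.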
# Loop over each tuple in the list 'dep_matches'
for match in dep_matches:

    # Take the first item in the tuple at [0] and assign it under
    # the variable 'pattern_name'. This item is an integer that
    # identifies the pattern in the Vocabulary.
    pattern_name = match[0]

    # Take the second item in the tuple at [1] and assign it under
    # the variable 'matches'. This is a list of indices referring to the
    # Doc object under 'doc' that we just matched.
    matches = match[1]

    # Let's unpack the matches list into variables for clarity
    verb, subject = matches[0], matches[1]

    # Print the matches by first fetching the name of the pattern from the
    # Vocabulary object. Next, use the 'subject' and 'verb' variables to
    # index the Doc object. This gives us the actual Tokens matched. Use a
    # tabulator ('\t') and some stops ('...') to separate the output.
    print(nlp.vocab[pattern_name].text, '\t', doc[subject], '...', doc[verb])
nsubj_verb that ... expressed
nsubj_verb It ... aimed
nsubj_verb movement ... had
nsubj_verb groups ... had
nsubj_verb concerns ... included
nsubj_verb corporations ... control
nsubj_verb that ... benefited
nsubj_verb It ... formed
nsubj_verb Steger ... called
nsubj_verb Occupy ... began
nsubj_verb protests ... taken
nsubj_verb movement ... became
nsubj_verb protests ... started
nsubj_verb repression ... remained
nsubj_verb this ... began
nsubj_verb police ... attempted
nsubj_verb authorities ... cleared
nsubj_verb movement ... uses
nsubj_verb it ... organizes
nsubj_verb West ... described
nsubj_verb Director ... stated
nsubj_verb students ... occupied
nsubj_verb that ... resulted
nsubj_verb slogan ... emerged
nsubj_verb Post ... noted
nsubj_verb that ... read
nsubj_verb who ... designed
nsubj_verb White ... traveled
nsubj_verb He ... wrote
nsubj_verb movement ... began
nsubj_verb camping ... marked
nsubj_verb leader ... called
nsubj_verb Foundation ... proposed
nsubj_verb Lasn ... registered
nsubj_verb we ... floated
nsubj_verb it ... snowballed
nsubj_verb Village ... set
nsubj_verb protest ... received
nsubj_verb group ... encouraged
nsubj_verb They ... promoted
nsubj_verb It ... refers
nsubj_verb percent ... tripled
nsubj_verb incomes ... grew
nsubj_verb % ... saw
nsubj_verb income ... decreased
nsubj_verb that ... increased
nsubj_verb taxation ... became
nsubj_verb earners ... saw
nsubj_verb income ... increase
nsubj_verb % ... owned
nsubj_verb % ... owned
nsubj_verb % ... owned
nsubj_verb % ... owning
nsubj_verb which ... started
nsubj_verb share ... grew
nsubj_verb that ... grew
nsubj_verb Recession ... caused
nsubj_verb income ... grew
nsubj_verb % ... went
nsubj_verb who ... had
nsubj_verb that ... indicate
nsubj_verb Lasn ... said
nsubj_verb that ... allowed
nsubj_verb movement ... grow
nsubj_verb Adbusters ... trying
nsubj_verb Wolf ... argued
nsubj_verb Wolf ... argued
nsubj_verb they ... have
nsubj_verb they ... saw
nsubj_verb magazine ... stated
nsubj_verb protesters ... wanted
nsubj_verb commentators ... criticized
nsubj_verb movement ... defined
nsubj_verb they ... argued
nsubj_verb movement ... seeks
nsubj_verb contingent ... released
nsubj_verb they ... called
nsubj_verb it ... takes
nsubj_verb Occupy ... said
nsubj_verb they ... working
nsubj_verb that ... reflected
nsubj_verb Activists ... used
nsubj_verb Indymedia ... helped
nsubj_verb provider ... offered
nsubj_verb movement ... went
nsubj_verb Fund ... released
nsubj_verb that ... strip
nsubj_verb Homes ... embarked
nsubj_verb who ... lost
nsubj_verb they ... called
nsubj_verb that ... took
nsubj_verb group ... planned
nsubj_verb Much ... occurs
nsubj_verb This ... features
nsubj_verb who ... comment
nsubj_verb anyone ... join
nsubj_verb Street ... uses
nsubj_verb they ... belong
nsubj_verb women ... get
nsubj_verb males ... wait
nsubj_verb turn ... speak
nsubj_verb movement ... began
nsubj_verb which ... premiered
nsubj_verb Sharp ... warned
nsubj_verb movement ... employing
nsubj_verb he ... said
nsubj_verb protesters ... have
nsubj_verb they ... achieve
nsubj_verb they ... think
nsubj_verb they ... change
nsubj_verb Protest ... accomplishes
nsubj_verb Castells ... congratulated
nsubj_verb Castells ... said
nsubj_verb it ... help
nsubj_verb them ... gain
nsubj_verb they ... make
nsubj_verb Group ... endorsed
nsubj_verb occupiers ... upheld
nsubj_verb journalists ... saying
nsubj_verb branch ... accept
nsubj_verb who ... signed
nsubj_verb Klein ... congratulated
nsubj_verb sources ... began
nsubj_verb camps ... responded
nsubj_verb occupiers ... sign
nsubj_verb they ... wished
nsubj_verb Hampton ... said
nsubj_verb Barnett ... said
nsubj_verb nonviolence ... remained
nsubj_verb that ... saw
nsubj_verb protestors ... said
nsubj_verb police ... initiated
nsubj_verb others ... said
nsubj_verb they ... blamed
nsubj_verb who ... take
nsubj_verb protester ... stated
nsubj_verb I ... support
nsubj_verb who ... have
nsubj_verb I ... support
nsubj_verb I ... expressed
nsubj_verb movement ... relied
nsubj_verb accounts ... became
nsubj_verb Some ... believe
nsubj_verb that ... followed
nsubj_verb interests ... changed
nsubj_verb It ... showed
nsubj_verb ratio ... dropping
nsubj_verb Some ... find
nsubj_verb celebrities ... made
nsubj_verb West ... justified
nsubj_verb celebrities ... tweeted
nsubj_verb Moore ... tweeted
nsubj_verb Many ... hold
nsubj_verb success ... led
nsubj_verb Some ... believe
nsubj_verb people ... used
nsubj_verb WikiLeaks ... endorsed
nsubj_verb Central ... began
nsubj_verb editor ... modeled
nsubj_verb protests ... began
nsubj_verb activists ... repeated
nsubj_verb list ... included
nsubj_verb protesters ... gathered
nsubj_verb people ... stayed
nsubj_verb protesters ... started
nsubj_verb officers ... used
nsubj_verb which ... involves
nsubj_verb which ... showed
nsubj_verb attention ... resulted
nsubj_verb Haberman ... said
nsubj_verb protesters ... choose
nsubj_verb who ... gave
nsubj_verb they ... want
nsubj_verb protesters ... set
nsubj_verb Times ... reported
nsubj_verb Some ... said
nsubj_verb police ... tricked
nsubj_verb Myerson ... said
nsubj_verb cops ... watched
nsubj_verb spokesman ... said
nsubj_verb they ... refused
nsubj_verb group ... filed
nsubj_verb officers ... violated
nsubj_verb judge ... ruled
nsubj_verb protesters ... received
nsubj_verb evidence ... showed
nsubj_verb police ... warning
nsubj_verb Rakoff ... sided
nsubj_verb officer ... known
nsubj_verb horn ... communicate
nsubj_verb demonstration ... swelled
nsubj_verb marchers ... joining
nsubj_verb protests ... continued
nsubj_verb Thousands ... joined
nsubj_verb protesters ... marching
nsubj_verb scuffles ... erupted
nsubj_verb protesters ... tried
nsubj_verb Police ... responded
nsubj_verb protesters ... organized
nsubj_verb they ... saw
nsubj_verb One ... said
nsubj_verb Government ... made
nsubj_verb who ... caused
nsubj_verb crisis ... get
nsubj_verb people ... pay
nsubj_verb thousands ... staging
nsubj_verb people ... protested
nsubj_verb protesters ... carried
nsubj_verb We ... bail
nsubj_verb that ... drew
nsubj_verb protest ... turned
nsubj_verb Thousands ... gathered
nsubj_verb police ... cleared
nsubj_verb Jordan ... expressed
nsubj_verb police ... suffered
nsubj_verb who ... sought
nsubj_verb Olsen ... suffered
nsubj_verb protesters ... shut
nsubj_verb Police ... estimated
nsubj_verb 4,500 ... marched
nsubj_verb protesters ... held
nsubj_verb people ... took
nsubj_verb police ... removed
nsubj_verb authorities ... stepped
nsubj_verb Lambert ... suggested
nsubj_verb it ... disband
nsubj_verb Gapper ... offered
nsubj_verb Gapper ... said
nsubj_verb they ... beginning
nsubj_verb Pike ... used
nsubj_verb incident ... drew
nsubj_verb Katehi ... resign
nsubj_verb occupiers ... checked
nsubj_verb they ... received
nsubj_verb occupiers ... begun
nsubj_verb Bond ... found
nsubj_verb issues ... included
nsubj_verb Homes ... joined
nsubj_verb activists ... planted
nsubj_verb that ... lasted
nsubj_verb Post ... reported
nsubj_verb which ... disbanded
nsubj_verb some ... facing
nsubj_verb Nigeria ... began
nsubj_verb most ... took
nsubj_verb strikes ... shutting
nsubj_verb Jonathan ... responded
nsubj_verb he ... bring
nsubj_verb 2012 ... seen
nsubj_verb universities ... begun
nsubj_verb course ... includes
nsubj_verb students ... join
nsubj_verb teams ... planning
nsubj_verb LLC ... reached
nsubj_verb agreement ... resolved
nsubj_verb workers ... work
nsubj_verb This ... came
nsubj_verb which ... shut
nsubj_verb goals ... included
nsubj_verb poll ... found
nsubj_verb supporters ... outweighed
nsubj_verb Occupy ... protested
nsubj_verb Street ... attempted
nsubj_verb who ... made
nsubj_verb movement ... marked
nsubj_verb that ... took
nsubj_verb This ... included
nsubj_verb members ... gathered
nsubj_verb movement ... celebrated
nsubj_verb activists ... set
nsubj_verb activists ... claimed
nsubj_verb which ... prohibit
nsubj_verb occupiers ... claim
nsubj_verb beings ... need
nsubj_verb people ... protect
nsubj_verb activists ... said
nsubj_verb vigil ... continue
nsubj_verb Hales ... ordered
nsubj_verb movement ... transformed
nsubj_verb campaigns ... emerged
nsubj_verb campaigns ... include
nsubj_verb which ... provided
nsubj_verb Sandy ... hit
nsubj_verb that ... hosted
nsubj_verb which ... monitors
nsubj_verb which ... raising
nsubj_verb individual ... re
nsubj_verb which ... developed
nsubj_verb program ... worked
nsubj_verb hundreds ... protested
nsubj_verb supporters ... protesting
nsubj_verb Sanders ... received
nsubj_verb protestors ... claiming
nsubj_verb networks ... blacked
nsubj_verb spirit ... lives
nsubj_verb anarchism ... began
nsubj_verb Cafe ... continues
nsubj_verb Movement ... organized
nsubj_verb groups ... emerged
nsubj_verb group ... swarmed
nsubj_verb it ... shutdown
nsubj_verb hundreds ... took
nsubj_verb blockade ... caused
nsubj_verb building ... shutdown
nsubj_verb staff ... citing
nsubj_verb Feds ... ordered
nsubj_verb officers ... moved
nsubj_verb Kalamazoo ... began
nsubj_verb efforts ... received
nsubj_verb who ... criticized
nsubj_verb protesters ... faced
nsubj_verb demonstrations ... continuing
nsubj_verb leader ... named
nsubj_verb demonstrations ... took
nsubj_verb protesters ... defied
nsubj_verb Occupiers ... returned
nsubj_verb Sydney ... had
nsubj_verb it ... returned
nsubj_verb demonstration ... took
nsubj_verb movement ... had
nsubj_verb people ... attended
nsubj_verb Three ... took
nsubj_verb one ... took
nsubj_verb protests ... included
nsubj_verb protesters ... say
nsubj_verb Ghent ... began
nsubj_verb They ... received
nsubj_verb that ... advocates
nsubj_verb protests ... taken
nsubj_verb people ... gathered
nsubj_verb 150 ... stayed
nsubj_verb people ... marched
nsubj_verb 100 ... continued
nsubj_verb 1,000 ... gathered
nsubj_verb people ... occupied
nsubj_verb people ... occupied
nsubj_verb group ... occupied
nsubj_verb protestors ... began
nsubj_verb Party ... participated
nsubj_verb Police ... dissolved
nsubj_verb protesters ... started
nsubj_verb movement ... used
nsubj_verb protesters ... showed
nsubj_verb camp ... lived
nsubj_verb movement ... shifted
nsubj_verb protesters ... started
nsubj_verb relations ... varied
nsubj_verb police ... joined
nsubj_verb people ... joined
nsubj_verb protests ... begun
nsubj_verb students ... began
nsubj_verb it ... spread
nsubj_verb movement ... began
nsubj_verb Occupy ... took
nsubj_verb Berlin ... established
nsubj_verb protests ... took
nsubj_verb Police ... reported
nsubj_verb people ... protested
nsubj_verb people ... took
nsubj_verb organisers ... claimed
nsubj_verb HSBC ... filed
nsubj_verb Court ... ruled
nsubj_verb protesters ... leave
nsubj_verb people ... gathered
nsubj_verb protests ... occurred
nsubj_verb Laterano ... received
nsubj_verb protesters ... had
nsubj_verb fingers ... amputated
nsubj_verb people ... occupied
nsubj_verb movement ... held
nsubj_verb people ... took
nsubj_verb movement ... spread
nsubj_verb Occupy ... began
nsubj_verb government ... guarantee
nsubj_verb it ... came
nsubj_verb Police ... remained
nsubj_verb Mexico ... achieve
nsubj_verb it ... gained
nsubj_verb protesters ... failed
nsubj_verb protests ... occurred
nsubj_verb movement ... drew
nsubj_verb protesters ... remained
nsubj_verb Ganbaatar ... announced
nsubj_verb association ... joins
nsubj_verb He ... claimed
nsubj_verb bankers ... charging
nsubj_verb protesters ... gathered
nsubj_verb protesters ... created
nsubj_verb they ... presented
nsubj_verb demands ... investigate
nsubj_verb which ... took
nsubj_verb demands ... focused
nsubj_verb protests ... took
nsubj_verb protests ... began
nsubj_verb protest ... started
nsubj_verb they ... call
nsubj_verb police ... moved
nsubj_verb police ... said
nsubj_verb that ... started
nsubj_verb movement ... ended
nsubj_verb which ... saw
nsubj_verb it ... sells
nsubj_verb movement ... began
nsubj_verb movement ... met
nsubj_verb Times ... described
nsubj_verb group ... has
nsubj_verb who ... invited
nsubj_verb people ... took
nsubj_verb camp ... survived
nsubj_verb group ... occupying
nsubj_verb camp ... lasted
nsubj_verb It ... consists
nsubj_verb groups ... adopted
nsubj_verb Hundreds ... held
nsubj_verb Protesters ... focused
nsubj_verb which ... started
nsubj_verb One ... argued
nsubj_verb Korea ... overcame
nsubj_verb series ... demands
nsubj_verb protesters ... consider
nsubj_verb media ... related
nsubj_verb Movement ... drew
nsubj_verb protesters ... demonstrated
nsubj_verb protesters ... established
nsubj_verb protests ... developed
nsubj_verb which ... featured
nsubj_verb reaction ... caused
nsubj_verb protests ... widen
nsubj_verb people ... finding
nsubj_verb all ... finding
nsubj_verb What ... started
nsubj_verb Demands ... included
nsubj_verb protests ... spread
nsubj_verb protesters ... gathered
nsubj_verb Police ... sealed
nsubj_verb people ... gathered
nsubj_verb canon ... said
nsubj_verb people ... exercise
nsubj_verb protests ... occurred
nsubj_verb camps ... took
nsubj_verb protests ... focused
nsubj_verb Police ... arrested
nsubj_verb who ... occupying
nsubj_verb police ... arrested
nsubj_verb body ... occupied
nsubj_verb camp ... lasted
nsubj_verb police ... swept
nsubj_verb police ... dragging
nsubj_verb Police ... said
nsubj_verb protesters ... remained
nsubj_verb group ... says
nsubj_verb that ... challenge
nsubj_verb Belfast ... initiated
nsubj_verb Belfast ... took
nsubj_verb It ... took
nsubj_verb Derry ... take
nsubj_verb Coleraine ... took
nsubj_verb group ... protested
nsubj_verb Council ... backed
nsubj_verb Protesters ... set
nsubj_verb council ... obtained
nsubj_verb council ... agreed
nsubj_verb Cardiff ... set
nsubj_verb Cardiff ... set
nsubj_verb protests ... began
nsubj_verb movement ... rejects
nsubj_verb police ... discovered
nsubj_verb march ... resulted
nsubj_verb Police ... used
nsubj_verb march ... received
nsubj_verb protesters ... attempted
nsubj_verb Some ... said
nsubj_verb police ... tricked
nsubj_verb they ... began
nsubj_verb officers ... cleared
nsubj_verb Police ... fired
nsubj_verb organizers ... said
nsubj_verb Olsen ... suffered
nsubj_verb witnesses ... believed
nsubj_verb protesters ... shut
nsubj_verb Police ... estimated
nsubj_verb 4,500 ... marched
nsubj_verb police ... cleared
nsubj_verb journalists ... complained
nsubj_verb police ... made
nsubj_verb journalists ... responded
nsubj_verb they ... perceived
nsubj_verb that ... threaten
nsubj_verb McKenzie ... commented
nsubj_verb authorities ... honour
nsubj_verb Homes ... embarked
nsubj_verb they ... say
nsubj_verb who ... made
nsubj_verb that ... took
nsubj_verb movement ... took
nsubj_verb protesters ... returned
nsubj_verb Rousseff ... said
nsubj_verb We ... agree
nsubj_verb movements ... used
nsubj_verb we ... see
nsubj_verb Flaherty ... expressed
nsubj_verb He ... commented
nsubj_verb I ... understand
nsubj_verb Singh ... described
nsubj_verb Khamenei ... voiced
nsubj_verb it ... grow
nsubj_verb it ... bring
nsubj_verb Brown ... said
nsubj_verb who ... say
nsubj_verb we ... build
nsubj_verb people ... take
nsubj_verb they ... do
nsubj_verb they ... reflect
nsubj_verb He ... mentioned
nsubj_verb politics ... speaks
nsubj_verb Council ... set
nsubj_verb We ... regard
nsubj_verb Edinburgh ... stated
nsubj_verb States ... spoke
nsubj_verb authorities ... collaborated
nsubj_verb who ... participated
nsubj_verb authorities ... sent
nsubj_verb administration ... worked
nsubj_verb Venezuela ... condemned
nsubj_verb Affairs ... had
nsubj_verb Fukuyama ... argued
nsubj_verb he ... wrote
nsubj_verb populism ... taken
nsubj_verb survey ... suggested
nsubj_verb movement ... succeeded
nsubj_verb protesters ... lent
nsubj_verb message ... declared
nsubj_verb interests ... cater
nsubj_verb generation ... grown
nsubj_verb we ... have
nsubj_verb Branson ... said
nsubj_verb they ... protesting
nsubj_verb community ... takes
nsubj_verb they ... made
nsubj_verb Jackson ... said
nsubj_verb which ... sweeping
nsubj_verb survey ... found
nsubj_verb many ... reported
nsubj_verb who ... dislike
nsubj_verb Support ... varied
nsubj_verb Australia ... reporting
nsubj_verb impacts ... include
nsubj_verb protests ... helped
nsubj_verb Americans ... face
nsubj_verb that ... burdens
nsubj_verb movement ... appears
nsubj_verb print ... mentioned
nsubj_verb occupation ... began
nsubj_verb interest ... waned
nsubj_verb movement ... raised
nsubj_verb organizers ... consider
nsubj_verb unions ... become
nsubj_verb they ... employ
nsubj_verb protest ... provided
nsubj_verb Offshoots ... bought
nsubj_verb individuals ... owe
nsubj_verb they ... have
nsubj_verb Jubilee ... claims
nsubj_verb Chomsky ... argues
nsubj_verb movement ... created
nsubj_verb that ... exist
nsubj_verb people ... doing
nsubj_verb Jubilee ... reports
nsubj_verb it ... cleared
nsubj_verb Telegraph ... reported
nsubj_verb members ... voted
nsubj_verb shows ... using
nsubj_verb Office ... made
nsubj_verb City ... added
nsubj_verb Conan ... launched
nsubj_verb Times ... argued
nsubj_verb movement ... had
nsubj_verb commentators ... suggested
nsubj_verb movement ... had
nsubj_verb Economist ... reported
nsubj_verb protesters ... caused
nsubj_verb government ... pass
nsubj_verb banks ... claw
nsubj_verb Deutch ... introduced
nsubj_verb which ... overturn
nsubj_verb Gore ... called
nsubj_verb It ... works
nsubj_verb Mason ... said
nsubj_verb movement ... started
nsubj_verb it ... have
nsubj_verb journalists ... suggested
nsubj_verb Occupy ... influenced
nsubj_verb movement ... creating
nsubj_verb Inequality ... remained
nsubj_verb he ... mentions
nsubj_verb analysts ... say
nsubj_verb which ... reflects
nsubj_verb Occupy ... become
nsubj_verb inequality ... become
nsubj_verb Magazine ... declared
nsubj_verb protests ... began
nsubj_verb FBI ... formed
nsubj_verb Banks ... met
nsubj_verb FBI ... offered
nsubj_verb officials ... met
nsubj_verb officials ... met
nsubj_verb FBI ... used
nsubj_verb which ... gave
nsubj_verb DSAC ... coordinated
nsubj_verb organizations ... filed
nsubj_verb FBI ... withheld
nsubj_verb Shapiro ... sent
nsubj_verb FBI ... refused
nsubj_verb Shapiro ... filed
nsubj_verb document ... confirmed
nsubj_verb it ... opened
nsubj_verb critique ... concerns
nsubj_verb movement ... focused
nsubj_verb that ... differs
nsubj_verb dominance ... becomes
nsubj_verb that ... follow
nsubj_verb Practicality ... dominates
nsubj_verb that ... rationalize
nsubj_verb activists ... seen
nsubj_verb it ... stall
nsubj_verb Dean ... argues
nsubj_verb focus ... paved
nsubj_verb Emphasis ... encouraged
nsubj_verb Celebration ... heightened
nsubj_verb who ... emerged
nsubj_verb anarchism ... suggests
nsubj_verb It ... pushes
nsubj_verb who ... called
nsubj_verb Remarks ... sparked
nsubj_verb many ... observed
nsubj_verb protests ... included
nsubj_verb Jews ... control
nsubj_verb who ... running
nsubj_verb Foxman ... stated
nsubj_verb that ... deals
nsubj_verb you ... going
nsubj_verb that ... believe
nsubj_verb they ... expressing
This gives us the verbs and their nominal subjects.
Note that when defining pattern rules for dependency matching, you can also create new “chains” that start from the anchor pattern.
For example, to find the direct objects (dobj) for the verbs matched above, we should not add this as a link to the existing chain whose rightmost item is currently named subject.
Instead, we need to start a new chain that begins from the anchor pattern verb.
{'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'd_object', 'RIGHT_ATTRS': {'DEP': 'dobj'}}
Just as above, we define that this pattern should be on the right-hand side of the pattern verb, essentially starting a new chain.
Furthermore, the pattern verb should govern this node (>), and the node should have the dependency relation dobj. We also name this pattern d_object using the RIGHT_ID attribute.
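To see why the new pattern must branch off from verb rather than continue from subject, consider the following minimal sketch. It does not use the Wikipedia article or a language model: the three-word sentence and its annotations are invented for illustration and assigned by hand when creating the Doc object. The names chained and branched are likewise hypothetical and only serve to tell the two matchers apart.

```python
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a tiny Doc with hand-written annotations, so that no language
# model is needed. The sentence is invented for this illustration.
doc = Doc(Vocab(),
          words=['Protesters', 'occupied', 'parks'],
          pos=['NOUN', 'VERB', 'NOUN'],
          heads=[1, 1, 1],                  # index of each Token's head
          deps=['nsubj', 'ROOT', 'dobj'])

# The anchor pattern and the subject pattern, as defined above
verb = {'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}}
subject = {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'subject',
           'RIGHT_ATTRS': {'DEP': 'nsubj'}}

# Continues the existing chain: searches for a dobj governed by the subject
dobj_from_subject = {'LEFT_ID': 'subject', 'REL_OP': '>', 'RIGHT_ID': 'd_object',
                     'RIGHT_ATTRS': {'DEP': 'dobj'}}

# Starts a new chain: searches for a dobj governed by the verb
dobj_from_verb = {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'd_object',
                  'RIGHT_ATTRS': {'DEP': 'dobj'}}

# Use a separate matcher for each variant to keep the results apart
chained = DependencyMatcher(doc.vocab)
chained.add('chained', [[verb, subject, dobj_from_subject]])

branched = DependencyMatcher(doc.vocab)
branched.add('branched', [[verb, subject, dobj_from_verb]])

chained_matches = chained(doc)
branched_matches = branched(doc)

# Print the number of matches for each variant
print(len(chained_matches), len(branched_matches))

# Print the Tokens matched by the branched pattern
for match_id, token_ids in branched_matches:
    print([doc[i].text for i in token_ids])
```

In this toy sentence, the direct object parks depends on the verb occupied, not on the subject Protesters. The chained variant should therefore find nothing, whereas the branched variant should return the verb, the subject and the direct object, in the order in which the patterns were defined.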
Let’s define a new pattern and add it to the DependencyMatcher object.
# Define a list with nested dictionaries that contains the pattern to be matched
dep_pattern_2 = [{'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}},
                 {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'subject', 'RIGHT_ATTRS': {'DEP': 'nsubj'}},
                 {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'd_object', 'RIGHT_ATTRS': {'DEP': 'dobj'}}
                 ]
# Add the pattern to the matcher under the name 'nsubj_verb_dobj'
dep_matcher.add('nsubj_verb_dobj', patterns=[dep_pattern_2])
# Apply the DependencyMatcher to the Doc object under 'doc'; Store the result
# under the variable 'dep_matches'.
dep_matches = dep_matcher(doc)
# Loop over each tuple in the list 'dep_matches'
for match in dep_matches:

    # Take the first item in the tuple at [0] and assign it under
    # the variable 'pattern_name'. This item is an integer hash that
    # identifies the pattern in the Vocabulary.
    pattern_name = match[0]

    # Take the second item in the tuple at [1] and assign it under
    # the variable 'matches'. This is a list of indices referring to the
    # Doc object under 'doc' that we just matched.
    matches = match[1]

    # Because we now have two patterns for matching which return lists of
    # different length, e.g. lists with two indices for 'nsubj_verb' and
    # lists with three indices for 'nsubj_verb_dobj', we must now define
    # conditional criteria for handling these lists.
    if len(matches) > 2:

        # Let's unpack the matches list into variables for clarity
        verb, subject, dobject = matches[0], matches[1], matches[2]

        # Print the matches by first fetching the name of the pattern from the
        # Vocabulary object. Next, use the 'subject', 'verb' and 'dobject'
        # variables to index the Doc object. This gives us the actual Tokens
        # matched. Use a tabulator ('\t') and some stops ('...') to separate
        # the output.
        print(nlp.vocab[pattern_name].text, '\t', doc[subject], '...', doc[verb], '...', doc[dobject])

    # Alternative condition with just two items in the list.
    else:

        # Let's unpack the matches list into variables for clarity
        verb, subject = matches[0], matches[1]

        # Print the matches by first fetching the name of the pattern from the
        # Vocabulary object. Next, use the 'subject' and 'verb' variables to
        # index the Doc object. This gives us the actual Tokens matched. Use a
        # tabulator ('\t') and some stops ('...') to separate the output.
        print(nlp.vocab[pattern_name].text, '\t', doc[subject], '...', doc[verb])
nsubj_verb that ... expressed
nsubj_verb It ... aimed
nsubj_verb movement ... had
nsubj_verb groups ... had
nsubj_verb concerns ... included
nsubj_verb corporations ... control
nsubj_verb that ... benefited
nsubj_verb It ... formed
nsubj_verb Steger ... called
nsubj_verb Occupy ... began
nsubj_verb protests ... taken
nsubj_verb movement ... became
nsubj_verb protests ... started
nsubj_verb repression ... remained
nsubj_verb this ... began
nsubj_verb police ... attempted
nsubj_verb authorities ... cleared
nsubj_verb movement ... uses
nsubj_verb it ... organizes
nsubj_verb West ... described
nsubj_verb Director ... stated
nsubj_verb students ... occupied
nsubj_verb that ... resulted
nsubj_verb slogan ... emerged
nsubj_verb Post ... noted
nsubj_verb that ... read
nsubj_verb who ... designed
nsubj_verb White ... traveled
nsubj_verb He ... wrote
nsubj_verb movement ... began
nsubj_verb camping ... marked
nsubj_verb leader ... called
nsubj_verb Foundation ... proposed
nsubj_verb Lasn ... registered
nsubj_verb we ... floated
nsubj_verb it ... snowballed
nsubj_verb Village ... set
nsubj_verb protest ... received
nsubj_verb group ... encouraged
nsubj_verb They ... promoted
nsubj_verb It ... refers
nsubj_verb percent ... tripled
nsubj_verb incomes ... grew
nsubj_verb % ... saw
nsubj_verb income ... decreased
nsubj_verb that ... increased
nsubj_verb taxation ... became
nsubj_verb earners ... saw
nsubj_verb income ... increase
nsubj_verb % ... owned
nsubj_verb % ... owned
nsubj_verb % ... owned
nsubj_verb % ... owning
nsubj_verb which ... started
nsubj_verb share ... grew
nsubj_verb that ... grew
nsubj_verb Recession ... caused
nsubj_verb income ... grew
nsubj_verb % ... went
nsubj_verb who ... had
nsubj_verb that ... indicate
nsubj_verb Lasn ... said
nsubj_verb that ... allowed
nsubj_verb movement ... grow
nsubj_verb Adbusters ... trying
nsubj_verb Wolf ... argued
nsubj_verb Wolf ... argued
nsubj_verb they ... have
nsubj_verb they ... saw
nsubj_verb magazine ... stated
nsubj_verb protesters ... wanted
nsubj_verb commentators ... criticized
nsubj_verb movement ... defined
nsubj_verb they ... argued
nsubj_verb movement ... seeks
nsubj_verb contingent ... released
nsubj_verb they ... called
nsubj_verb it ... takes
nsubj_verb Occupy ... said
nsubj_verb they ... working
nsubj_verb that ... reflected
nsubj_verb Activists ... used
nsubj_verb Indymedia ... helped
nsubj_verb provider ... offered
nsubj_verb movement ... went
nsubj_verb Fund ... released
nsubj_verb that ... strip
nsubj_verb Homes ... embarked
nsubj_verb who ... lost
nsubj_verb they ... called
nsubj_verb that ... took
nsubj_verb group ... planned
nsubj_verb Much ... occurs
nsubj_verb This ... features
nsubj_verb who ... comment
nsubj_verb anyone ... join
nsubj_verb Street ... uses
nsubj_verb they ... belong
nsubj_verb women ... get
nsubj_verb males ... wait
nsubj_verb turn ... speak
nsubj_verb movement ... began
nsubj_verb which ... premiered
nsubj_verb Sharp ... warned
nsubj_verb movement ... employing
nsubj_verb he ... said
nsubj_verb protesters ... have
nsubj_verb they ... achieve
nsubj_verb they ... think
nsubj_verb they ... change
nsubj_verb Protest ... accomplishes
nsubj_verb Castells ... congratulated
nsubj_verb Castells ... said
nsubj_verb it ... help
nsubj_verb them ... gain
nsubj_verb they ... make
nsubj_verb Group ... endorsed
nsubj_verb occupiers ... upheld
nsubj_verb journalists ... saying
nsubj_verb branch ... accept
nsubj_verb who ... signed
nsubj_verb Klein ... congratulated
nsubj_verb sources ... began
nsubj_verb camps ... responded
nsubj_verb occupiers ... sign
nsubj_verb they ... wished
nsubj_verb Hampton ... said
nsubj_verb Barnett ... said
nsubj_verb nonviolence ... remained
nsubj_verb that ... saw
nsubj_verb protestors ... said
nsubj_verb police ... initiated
nsubj_verb others ... said
nsubj_verb they ... blamed
nsubj_verb who ... take
nsubj_verb protester ... stated
nsubj_verb I ... support
nsubj_verb who ... have
nsubj_verb I ... support
nsubj_verb I ... expressed
nsubj_verb movement ... relied
nsubj_verb accounts ... became
nsubj_verb Some ... believe
nsubj_verb that ... followed
nsubj_verb interests ... changed
nsubj_verb It ... showed
nsubj_verb ratio ... dropping
nsubj_verb Some ... find
nsubj_verb celebrities ... made
nsubj_verb West ... justified
nsubj_verb celebrities ... tweeted
nsubj_verb Moore ... tweeted
nsubj_verb Many ... hold
nsubj_verb success ... led
nsubj_verb Some ... believe
nsubj_verb people ... used
nsubj_verb WikiLeaks ... endorsed
nsubj_verb Central ... began
nsubj_verb editor ... modeled
nsubj_verb protests ... began
nsubj_verb activists ... repeated
nsubj_verb list ... included
nsubj_verb protesters ... gathered
nsubj_verb people ... stayed
nsubj_verb protesters ... started
nsubj_verb officers ... used
nsubj_verb which ... involves
nsubj_verb which ... showed
nsubj_verb attention ... resulted
nsubj_verb Haberman ... said
nsubj_verb protesters ... choose
nsubj_verb who ... gave
nsubj_verb they ... want
nsubj_verb protesters ... set
nsubj_verb Times ... reported
nsubj_verb Some ... said
nsubj_verb police ... tricked
nsubj_verb Myerson ... said
nsubj_verb cops ... watched
nsubj_verb spokesman ... said
nsubj_verb they ... refused
nsubj_verb group ... filed
nsubj_verb officers ... violated
nsubj_verb judge ... ruled
nsubj_verb protesters ... received
nsubj_verb evidence ... showed
nsubj_verb police ... warning
nsubj_verb Rakoff ... sided
nsubj_verb officer ... known
nsubj_verb horn ... communicate
nsubj_verb demonstration ... swelled
nsubj_verb marchers ... joining
nsubj_verb protests ... continued
nsubj_verb Thousands ... joined
nsubj_verb protesters ... marching
nsubj_verb scuffles ... erupted
nsubj_verb protesters ... tried
nsubj_verb Police ... responded
nsubj_verb protesters ... organized
nsubj_verb they ... saw
nsubj_verb One ... said
nsubj_verb Government ... made
nsubj_verb who ... caused
nsubj_verb crisis ... get
nsubj_verb people ... pay
nsubj_verb thousands ... staging
nsubj_verb people ... protested
nsubj_verb protesters ... carried
nsubj_verb We ... bail
nsubj_verb that ... drew
nsubj_verb protest ... turned
nsubj_verb Thousands ... gathered
nsubj_verb police ... cleared
nsubj_verb Jordan ... expressed
nsubj_verb police ... suffered
nsubj_verb who ... sought
nsubj_verb Olsen ... suffered
nsubj_verb protesters ... shut
nsubj_verb Police ... estimated
nsubj_verb 4,500 ... marched
nsubj_verb protesters ... held
nsubj_verb people ... took
nsubj_verb police ... removed
nsubj_verb authorities ... stepped
nsubj_verb Lambert ... suggested
nsubj_verb it ... disband
nsubj_verb Gapper ... offered
nsubj_verb Gapper ... said
nsubj_verb they ... beginning
nsubj_verb Pike ... used
nsubj_verb incident ... drew
nsubj_verb Katehi ... resign
nsubj_verb occupiers ... checked
nsubj_verb they ... received
nsubj_verb occupiers ... begun
nsubj_verb Bond ... found
nsubj_verb issues ... included
nsubj_verb Homes ... joined
nsubj_verb activists ... planted
nsubj_verb that ... lasted
nsubj_verb Post ... reported
nsubj_verb which ... disbanded
nsubj_verb some ... facing
nsubj_verb Nigeria ... began
nsubj_verb most ... took
nsubj_verb strikes ... shutting
nsubj_verb Jonathan ... responded
nsubj_verb he ... bring
nsubj_verb 2012 ... seen
nsubj_verb universities ... begun
nsubj_verb course ... includes
nsubj_verb students ... join
nsubj_verb teams ... planning
nsubj_verb LLC ... reached
nsubj_verb agreement ... resolved
nsubj_verb workers ... work
nsubj_verb This ... came
nsubj_verb which ... shut
nsubj_verb goals ... included
nsubj_verb poll ... found
nsubj_verb supporters ... outweighed
nsubj_verb Occupy ... protested
nsubj_verb Street ... attempted
nsubj_verb who ... made
nsubj_verb movement ... marked
nsubj_verb that ... took
nsubj_verb This ... included
nsubj_verb members ... gathered
nsubj_verb movement ... celebrated
nsubj_verb activists ... set
nsubj_verb activists ... claimed
nsubj_verb which ... prohibit
nsubj_verb occupiers ... claim
nsubj_verb beings ... need
nsubj_verb people ... protect
nsubj_verb activists ... said
nsubj_verb vigil ... continue
nsubj_verb Hales ... ordered
nsubj_verb movement ... transformed
nsubj_verb campaigns ... emerged
nsubj_verb campaigns ... include
nsubj_verb which ... provided
nsubj_verb Sandy ... hit
nsubj_verb that ... hosted
nsubj_verb which ... monitors
nsubj_verb which ... raising
nsubj_verb individual ... re
nsubj_verb which ... developed
nsubj_verb program ... worked
nsubj_verb hundreds ... protested
nsubj_verb supporters ... protesting
nsubj_verb Sanders ... received
nsubj_verb protestors ... claiming
nsubj_verb networks ... blacked
nsubj_verb spirit ... lives
nsubj_verb anarchism ... began
nsubj_verb Cafe ... continues
nsubj_verb Movement ... organized
nsubj_verb groups ... emerged
nsubj_verb group ... swarmed
nsubj_verb it ... shutdown
nsubj_verb hundreds ... took
nsubj_verb blockade ... caused
nsubj_verb building ... shutdown
nsubj_verb staff ... citing
nsubj_verb Feds ... ordered
nsubj_verb officers ... moved
nsubj_verb Kalamazoo ... began
nsubj_verb efforts ... received
nsubj_verb who ... criticized
nsubj_verb protesters ... faced
nsubj_verb demonstrations ... continuing
nsubj_verb leader ... named
nsubj_verb demonstrations ... took
nsubj_verb protesters ... defied
nsubj_verb Occupiers ... returned
nsubj_verb Sydney ... had
nsubj_verb it ... returned
nsubj_verb demonstration ... took
nsubj_verb movement ... had
nsubj_verb people ... attended
nsubj_verb Three ... took
nsubj_verb one ... took
nsubj_verb protests ... included
nsubj_verb protesters ... say
nsubj_verb Ghent ... began
nsubj_verb They ... received
nsubj_verb that ... advocates
nsubj_verb protests ... taken
nsubj_verb people ... gathered
nsubj_verb 150 ... stayed
nsubj_verb people ... marched
nsubj_verb 100 ... continued
nsubj_verb 1,000 ... gathered
nsubj_verb people ... occupied
nsubj_verb people ... occupied
nsubj_verb group ... occupied
nsubj_verb protestors ... began
nsubj_verb Party ... participated
nsubj_verb Police ... dissolved
nsubj_verb protesters ... started
nsubj_verb movement ... used
nsubj_verb protesters ... showed
nsubj_verb camp ... lived
nsubj_verb movement ... shifted
nsubj_verb protesters ... started
nsubj_verb relations ... varied
nsubj_verb police ... joined
nsubj_verb people ... joined
nsubj_verb protests ... begun
nsubj_verb students ... began
nsubj_verb it ... spread
nsubj_verb movement ... began
nsubj_verb Occupy ... took
nsubj_verb Berlin ... established
nsubj_verb protests ... took
nsubj_verb Police ... reported
nsubj_verb people ... protested
nsubj_verb people ... took
nsubj_verb organisers ... claimed
nsubj_verb HSBC ... filed
nsubj_verb Court ... ruled
nsubj_verb protesters ... leave
nsubj_verb people ... gathered
nsubj_verb protests ... occurred
nsubj_verb Laterano ... received
nsubj_verb protesters ... had
nsubj_verb fingers ... amputated
nsubj_verb people ... occupied
nsubj_verb movement ... held
nsubj_verb people ... took
nsubj_verb movement ... spread
nsubj_verb Occupy ... began
nsubj_verb government ... guarantee
nsubj_verb it ... came
nsubj_verb Police ... remained
nsubj_verb Mexico ... achieve
nsubj_verb it ... gained
nsubj_verb protesters ... failed
nsubj_verb protests ... occurred
nsubj_verb movement ... drew
nsubj_verb protesters ... remained
nsubj_verb Ganbaatar ... announced
nsubj_verb association ... joins
nsubj_verb He ... claimed
nsubj_verb bankers ... charging
nsubj_verb protesters ... gathered
nsubj_verb protesters ... created
nsubj_verb they ... presented
nsubj_verb demands ... investigate
nsubj_verb which ... took
nsubj_verb demands ... focused
nsubj_verb protests ... took
nsubj_verb protests ... began
nsubj_verb protest ... started
nsubj_verb they ... call
nsubj_verb police ... moved
nsubj_verb police ... said
nsubj_verb that ... started
nsubj_verb movement ... ended
nsubj_verb which ... saw
nsubj_verb it ... sells
nsubj_verb movement ... began
nsubj_verb movement ... met
nsubj_verb Times ... described
nsubj_verb group ... has
nsubj_verb who ... invited
nsubj_verb people ... took
nsubj_verb camp ... survived
nsubj_verb group ... occupying
nsubj_verb camp ... lasted
nsubj_verb It ... consists
nsubj_verb groups ... adopted
nsubj_verb Hundreds ... held
nsubj_verb Protesters ... focused
nsubj_verb which ... started
nsubj_verb One ... argued
nsubj_verb Korea ... overcame
nsubj_verb series ... demands
nsubj_verb protesters ... consider
nsubj_verb media ... related
nsubj_verb Movement ... drew
nsubj_verb protesters ... demonstrated
nsubj_verb protesters ... established
nsubj_verb protests ... developed
nsubj_verb which ... featured
nsubj_verb reaction ... caused
nsubj_verb protests ... widen
nsubj_verb people ... finding
nsubj_verb all ... finding
nsubj_verb What ... started
nsubj_verb Demands ... included
nsubj_verb protests ... spread
nsubj_verb protesters ... gathered
nsubj_verb Police ... sealed
nsubj_verb people ... gathered
nsubj_verb canon ... said
nsubj_verb people ... exercise
nsubj_verb protests ... occurred
nsubj_verb camps ... took
nsubj_verb protests ... focused
nsubj_verb Police ... arrested
nsubj_verb who ... occupying
nsubj_verb police ... arrested
nsubj_verb body ... occupied
nsubj_verb camp ... lasted
nsubj_verb police ... swept
nsubj_verb police ... dragging
nsubj_verb Police ... said
nsubj_verb protesters ... remained
nsubj_verb group ... says
nsubj_verb that ... challenge
nsubj_verb Belfast ... initiated
nsubj_verb Belfast ... took
nsubj_verb It ... took
nsubj_verb Derry ... take
nsubj_verb Coleraine ... took
nsubj_verb group ... protested
nsubj_verb Council ... backed
nsubj_verb Protesters ... set
nsubj_verb council ... obtained
nsubj_verb council ... agreed
nsubj_verb Cardiff ... set
nsubj_verb Cardiff ... set
nsubj_verb protests ... began
nsubj_verb movement ... rejects
nsubj_verb police ... discovered
nsubj_verb march ... resulted
nsubj_verb Police ... used
nsubj_verb march ... received
nsubj_verb protesters ... attempted
nsubj_verb Some ... said
nsubj_verb police ... tricked
nsubj_verb they ... began
nsubj_verb officers ... cleared
nsubj_verb Police ... fired
nsubj_verb organizers ... said
nsubj_verb Olsen ... suffered
nsubj_verb witnesses ... believed
nsubj_verb protesters ... shut
nsubj_verb Police ... estimated
nsubj_verb 4,500 ... marched
nsubj_verb police ... cleared
nsubj_verb journalists ... complained
nsubj_verb police ... made
nsubj_verb journalists ... responded
nsubj_verb they ... perceived
nsubj_verb that ... threaten
nsubj_verb McKenzie ... commented
nsubj_verb authorities ... honour
nsubj_verb Homes ... embarked
nsubj_verb they ... say
nsubj_verb who ... made
nsubj_verb that ... took
nsubj_verb movement ... took
nsubj_verb protesters ... returned
nsubj_verb Rousseff ... said
nsubj_verb We ... agree
nsubj_verb movements ... used
nsubj_verb we ... see
nsubj_verb Flaherty ... expressed
nsubj_verb He ... commented
nsubj_verb I ... understand
nsubj_verb Singh ... described
nsubj_verb Khamenei ... voiced
nsubj_verb it ... grow
nsubj_verb it ... bring
nsubj_verb Brown ... said
nsubj_verb who ... say
nsubj_verb we ... build
nsubj_verb people ... take
nsubj_verb they ... do
nsubj_verb they ... reflect
nsubj_verb He ... mentioned
nsubj_verb politics ... speaks
nsubj_verb Council ... set
nsubj_verb We ... regard
nsubj_verb Edinburgh ... stated
nsubj_verb States ... spoke
nsubj_verb authorities ... collaborated
nsubj_verb who ... participated
nsubj_verb authorities ... sent
nsubj_verb administration ... worked
nsubj_verb Venezuela ... condemned
nsubj_verb Affairs ... had
nsubj_verb Fukuyama ... argued
nsubj_verb he ... wrote
nsubj_verb populism ... taken
nsubj_verb survey ... suggested
nsubj_verb movement ... succeeded
nsubj_verb protesters ... lent
nsubj_verb message ... declared
nsubj_verb interests ... cater
nsubj_verb generation ... grown
nsubj_verb we ... have
nsubj_verb Branson ... said
nsubj_verb they ... protesting
nsubj_verb community ... takes
nsubj_verb they ... made
nsubj_verb Jackson ... said
nsubj_verb which ... sweeping
nsubj_verb survey ... found
nsubj_verb many ... reported
nsubj_verb who ... dislike
nsubj_verb Support ... varied
nsubj_verb Australia ... reporting
nsubj_verb impacts ... include
nsubj_verb protests ... helped
nsubj_verb Americans ... face
nsubj_verb that ... burdens
nsubj_verb movement ... appears
nsubj_verb print ... mentioned
nsubj_verb occupation ... began
nsubj_verb interest ... waned
nsubj_verb movement ... raised
nsubj_verb organizers ... consider
nsubj_verb unions ... become
nsubj_verb they ... employ
nsubj_verb protest ... provided
nsubj_verb Offshoots ... bought
nsubj_verb individuals ... owe
nsubj_verb they ... have
nsubj_verb Jubilee ... claims
nsubj_verb Chomsky ... argues
nsubj_verb movement ... created
nsubj_verb that ... exist
nsubj_verb people ... doing
nsubj_verb Jubilee ... reports
nsubj_verb it ... cleared
nsubj_verb Telegraph ... reported
nsubj_verb members ... voted
nsubj_verb shows ... using
nsubj_verb Office ... made
nsubj_verb City ... added
nsubj_verb Conan ... launched
nsubj_verb Times ... argued
nsubj_verb movement ... had
nsubj_verb commentators ... suggested
nsubj_verb movement ... had
nsubj_verb Economist ... reported
nsubj_verb protesters ... caused
nsubj_verb government ... pass
nsubj_verb banks ... claw
nsubj_verb Deutch ... introduced
nsubj_verb which ... overturn
nsubj_verb Gore ... called
nsubj_verb It ... works
nsubj_verb Mason ... said
nsubj_verb movement ... started
nsubj_verb it ... have
nsubj_verb journalists ... suggested
nsubj_verb Occupy ... influenced
nsubj_verb movement ... creating
nsubj_verb Inequality ... remained
nsubj_verb he ... mentions
nsubj_verb analysts ... say
nsubj_verb which ... reflects
nsubj_verb Occupy ... become
nsubj_verb inequality ... become
nsubj_verb Magazine ... declared
nsubj_verb protests ... began
nsubj_verb FBI ... formed
nsubj_verb Banks ... met
nsubj_verb FBI ... offered
nsubj_verb officials ... met
nsubj_verb officials ... met
nsubj_verb FBI ... used
nsubj_verb which ... gave
nsubj_verb DSAC ... coordinated
nsubj_verb organizations ... filed
nsubj_verb FBI ... withheld
nsubj_verb Shapiro ... sent
nsubj_verb FBI ... refused
nsubj_verb Shapiro ... filed
nsubj_verb document ... confirmed
nsubj_verb it ... opened
nsubj_verb critique ... concerns
nsubj_verb movement ... focused
nsubj_verb that ... differs
nsubj_verb dominance ... becomes
nsubj_verb that ... follow
nsubj_verb Practicality ... dominates
nsubj_verb that ... rationalize
nsubj_verb activists ... seen
nsubj_verb it ... stall
nsubj_verb Dean ... argues
nsubj_verb focus ... paved
nsubj_verb Emphasis ... encouraged
nsubj_verb Celebration ... heightened
nsubj_verb who ... emerged
nsubj_verb anarchism ... suggests
nsubj_verb It ... pushes
nsubj_verb who ... called
nsubj_verb Remarks ... sparked
nsubj_verb many ... observed
nsubj_verb protests ... included
nsubj_verb Jews ... control
nsubj_verb who ... running
nsubj_verb Foxman ... stated
nsubj_verb that ... deals
nsubj_verb you ... going
nsubj_verb that ... believe
nsubj_verb they ... expressing
nsubj_verb_dobj that ... expressed ... opposition
nsubj_verb_dobj movement ... had ... scopes
nsubj_verb_dobj groups ... had ... focuses
nsubj_verb_dobj corporations ... control ... world
nsubj_verb_dobj that ... benefited ... minority
nsubj_verb_dobj It ... formed ... part
nsubj_verb_dobj Steger ... called ... what
nsubj_verb_dobj protests ... taken ... place
nsubj_verb_dobj authorities ... cleared ... most
nsubj_verb_dobj movement ... uses ... slogan
nsubj_verb_dobj West ... described ... which
nsubj_verb_dobj students ... occupied ... buildings
nsubj_verb_dobj slogan ... emerged ... that
nsubj_verb_dobj who ... designed ... concept
nsubj_verb_dobj camping ... marked ... start
nsubj_verb_dobj Foundation ... proposed ... occupation
nsubj_verb_dobj Lasn ... registered ... address
nsubj_verb_dobj we ... floated ... idea
nsubj_verb_dobj protest ... received ... attention
nsubj_verb_dobj group ... encouraged ... followers
nsubj_verb_dobj They ... promoted ... protest
nsubj_verb_dobj percent ... tripled ... income
nsubj_verb_dobj % ... saw ... rise
nsubj_verb_dobj % ... owned ... %
nsubj_verb_dobj % ... owned ... %
nsubj_verb_dobj % ... owning ... %
nsubj_verb_dobj Recession ... caused ... drop
nsubj_verb_dobj who ... had ... share
nsubj_verb_dobj that ... indicate ... distribution
nsubj_verb_dobj they ... have ... demands
nsubj_verb_dobj they ... saw ... what
nsubj_verb_dobj protesters ... wanted ... jobs
nsubj_verb_dobj protesters ... wanted ... distribution
nsubj_verb_dobj commentators ... criticized ... idea
nsubj_verb_dobj movement ... defined ... demands
nsubj_verb_dobj contingent ... released ... statement
nsubj_verb_dobj that ... reflected ... voices
nsubj_verb_dobj Activists ... used ... technologies
nsubj_verb_dobj Indymedia ... helped ... movement
nsubj_verb_dobj provider ... offered ... memberships
nsubj_verb_dobj Fund ... released ... bill
nsubj_verb_dobj that ... strip ... corporations
nsubj_verb_dobj they ... called ... what
nsubj_verb_dobj that ... took ... advantage
nsubj_verb_dobj This ... features ... use
nsubj_verb_dobj anyone ... join ... that
nsubj_verb_dobj protesters ... have ... objective
nsubj_verb_dobj protesters ... have ... something
nsubj_verb_dobj they ... change ... system
nsubj_verb_dobj Castells ... congratulated ... occupiers
nsubj_verb_dobj them ... gain ... coverage
nsubj_verb_dobj Group ... endorsed ... diversity
nsubj_verb_dobj occupiers ... upheld ... commitment
nsubj_verb_dobj branch ... accept ... protestors
nsubj_verb_dobj Klein ... congratulated ... occupiers
nsubj_verb_dobj occupiers ... sign ... resolution
nsubj_verb_dobj that ... saw ... arrests
nsubj_verb_dobj police ... initiated ... violence
nsubj_verb_dobj they ... blamed ... anarchists
nsubj_verb_dobj who ... take ... part
nsubj_verb_dobj I ... support ... idea
nsubj_verb_dobj who ... have ... housing
nsubj_verb_dobj I ... support ... it
nsubj_verb_dobj It ... showed ... %
nsubj_verb_dobj celebrities ... made ... appearances
nsubj_verb_dobj West ... justified ... appearance
nsubj_verb_dobj people ... used ... hashtag
nsubj_verb_dobj editor ... modeled ... concept
nsubj_verb_dobj activists ... repeated ... calls
nsubj_verb_dobj list ... included ... cities
nsubj_verb_dobj officers ... used ... technique
nsubj_verb_dobj who ... gave ... boost
nsubj_verb_dobj police ... tricked ... protesters
nsubj_verb_dobj group ... filed ... lawsuit
nsubj_verb_dobj officers ... violated ... rights
nsubj_verb_dobj protesters ... received ... warning
nsubj_verb_dobj protesters ... received ... entrance
nsubj_verb_dobj evidence ... showed ... warning
nsubj_verb_dobj police ... warning ... protesters
nsubj_verb_dobj horn ... communicate ... message
nsubj_verb_dobj marchers ... joining ... protest
nsubj_verb_dobj protesters ... organized ... occupation
nsubj_verb_dobj they ... saw ... what
nsubj_verb_dobj people ... pay ... price
nsubj_verb_dobj people ... pay ... those
nsubj_verb_dobj thousands ... staging ... demonstrations
nsubj_verb_dobj protesters ... carried ... banners
nsubj_verb_dobj We ... bail ... you
nsubj_verb_dobj that ... drew ... thousands
nsubj_verb_dobj Jordan ... expressed ... pleasure
nsubj_verb_dobj police ... suffered ... injuries
nsubj_verb_dobj Olsen ... suffered ... fracture
nsubj_verb_dobj protesters ... shut ... Port
nsubj_verb_dobj protesters ... held ... Day
nsubj_verb_dobj people ... took ... money
nsubj_verb_dobj police ... removed ... tents
nsubj_verb_dobj authorities ... stepped ... action
nsubj_verb_dobj Gapper ... offered ... view
nsubj_verb_dobj incident ... drew ... attention
nsubj_verb_dobj occupiers ... checked ... Obama
nsubj_verb_dobj issues ... included ... rate
nsubj_verb_dobj activists ... planted ... table
nsubj_verb_dobj which ... disbanded ... camps
nsubj_verb_dobj some ... facing ... challenges
nsubj_verb_dobj most ... took ... place
nsubj_verb_dobj strikes ... shutting ... cities
nsubj_verb_dobj he ... bring ... prices
nsubj_verb_dobj course ... includes ... work
nsubj_verb_dobj teams ... planning ... visits
nsubj_verb_dobj LLC ... reached ... agreement
nsubj_verb_dobj agreement ... resolved ... dispute
nsubj_verb_dobj which ... shut ... ports
nsubj_verb_dobj goals ... included ... support
nsubj_verb_dobj supporters ... outweighed ... those
nsubj_verb_dobj who ... made ... arrests
nsubj_verb_dobj movement ... marked ... resurgence
nsubj_verb_dobj that ... took ... place
nsubj_verb_dobj This ... included ... revival
nsubj_verb_dobj movement ... celebrated ... anniversary
nsubj_verb_dobj activists ... set ... table
nsubj_verb_dobj activists ... claimed ... laws
nsubj_verb_dobj which ... prohibit ... use
nsubj_verb_dobj people ... protect ... themselves
nsubj_verb_dobj Hales ... ordered ... removal
nsubj_verb_dobj campaigns ... include ... Sandy
nsubj_verb_dobj campaigns ... include ... Occupy
nsubj_verb_dobj which ... provided ... relief
nsubj_verb_dobj which ... monitors ... matters
nsubj_verb_dobj which ... raising ... money
nsubj_verb_dobj supporters ... protesting ... coverage
nsubj_verb_dobj networks ... blacked ... campaign
nsubj_verb_dobj Movement ... organized ... phase
nsubj_verb_dobj group ... swarmed ... facility
nsubj_verb_dobj hundreds ... took ... portion
nsubj_verb_dobj staff ... citing ... concerns
nsubj_verb_dobj Feds ... ordered ... protestors
nsubj_verb_dobj Kalamazoo ... began ... encampment
nsubj_verb_dobj efforts ... received ... support
nsubj_verb_dobj who ... criticized ... colleagues
nsubj_verb_dobj protesters ... faced ... violence
nsubj_verb_dobj leader ... named ... it
nsubj_verb_dobj demonstrations ... took ... place
nsubj_verb_dobj demonstrations ... took ... towns
nsubj_verb_dobj protesters ... defied ... orders
nsubj_verb_dobj Sydney ... had ... occupation
nsubj_verb_dobj demonstration ... took ... place
nsubj_verb_dobj movement ... had ... gathering
nsubj_verb_dobj people ... attended ... corner
nsubj_verb_dobj Three ... took ... place
nsubj_verb_dobj one ... took ... place
nsubj_verb_dobj protests ... included ... camping
nsubj_verb_dobj They ... received ... visit
nsubj_verb_dobj protests ... taken ... place
nsubj_verb_dobj people ... occupied ... Park
nsubj_verb_dobj people ... occupied ... front
nsubj_verb_dobj group ... occupied ... Park
nsubj_verb_dobj Police ... dissolved ... camp
nsubj_verb_dobj protesters ... started ... Camp
nsubj_verb_dobj movement ... used ... OccupyBufferZ
nsubj_verb_dobj police ... joined ... them
nsubj_verb_dobj people ... joined ... occupation
nsubj_verb_dobj Occupy ... took ... residence
nsubj_verb_dobj Berlin ... established ... camp
nsubj_verb_dobj protests ... took ... place
nsubj_verb_dobj HSBC ... filed ... lawsuit
nsubj_verb_dobj protesters ... leave ... area
nsubj_verb_dobj Laterano ... received ... damage
nsubj_verb_dobj people ... occupied ... Croce
nsubj_verb_dobj movement ... held ... assembly
nsubj_verb_dobj people ... took ... part
nsubj_verb_dobj government ... guarantee ... access
nsubj_verb_dobj Mexico ... achieve ... level
nsubj_verb_dobj movement ... drew ... thousands
nsubj_verb_dobj bankers ... charging ... rates
nsubj_verb_dobj protesters ... created ... set
nsubj_verb_dobj they ... presented ... which
nsubj_verb_dobj which ... took ... place
nsubj_verb_dobj protests ... took ... place
nsubj_verb_dobj they ... call ... what
nsubj_verb_dobj which ... saw ... restoration
nsubj_verb_dobj Times ... described ... movement
nsubj_verb_dobj group ... has ... structure
nsubj_verb_dobj who ... invited ... Democracy
nsubj_verb_dobj people ... took ... part
nsubj_verb_dobj group ... occupying ... amenity
nsubj_verb_dobj groups ... adopted ... Occupy4FreeEducation
nsubj_verb_dobj Hundreds ... held ... rallies
nsubj_verb_dobj Korea ... overcame ... crisis
nsubj_verb_dobj series ... demands ... change
nsubj_verb_dobj media ... related ... protests
nsubj_verb_dobj media ... related ... generation
nsubj_verb_dobj Movement ... drew ... inspiration
nsubj_verb_dobj protesters ... established ... occupation
nsubj_verb_dobj which ... featured ... use
nsubj_verb_dobj people ... finding ... reason
nsubj_verb_dobj all ... finding ... reason
nsubj_verb_dobj Demands ... included ... end
nsubj_verb_dobj Police ... sealed ... entrance
nsubj_verb_dobj people ... exercise ... right
nsubj_verb_dobj camps ... took ... place
nsubj_verb_dobj Police ... arrested ... members
nsubj_verb_dobj who ... occupying ... hotel
nsubj_verb_dobj police ... arrested ... people
nsubj_verb_dobj body ... occupied ... Tower
nsubj_verb_dobj police ... dragging ... protesters
nsubj_verb_dobj that ... challenge ... system
nsubj_verb_dobj Belfast ... initiated ... protest
nsubj_verb_dobj Belfast ... took ... residence
nsubj_verb_dobj It ... took ... control
nsubj_verb_dobj Derry ... take ... place
nsubj_verb_dobj Coleraine ... took ... University
nsubj_verb_dobj Coleraine ... took ... Room
nsubj_verb_dobj group ... protested ... demolition
nsubj_verb_dobj Council ... backed ... Edinburgh
nsubj_verb_dobj council ... obtained ... order
nsubj_verb_dobj Cardiff ... set ... site
nsubj_verb_dobj Cardiff ... set ... camp
nsubj_verb_dobj movement ... rejects ... institutions
nsubj_verb_dobj police ... discovered ... site
nsubj_verb_dobj Police ... used ... technique
nsubj_verb_dobj Police ... used ... use
nsubj_verb_dobj march ... received ... coverage
nsubj_verb_dobj police ... tricked ... protesters
nsubj_verb_dobj officers ... cleared ... sites
nsubj_verb_dobj Police ... fired ... canisters
nsubj_verb_dobj Olsen ... suffered ... fracture
nsubj_verb_dobj protesters ... shut ... Port
nsubj_verb_dobj police ... cleared ... encampment
nsubj_verb_dobj police ... made ... decision
nsubj_verb_dobj they ... perceived ... what
nsubj_verb_dobj that ... threaten ... protections
nsubj_verb_dobj authorities ... honour ... obligation
nsubj_verb_dobj who ... made ... billions
nsubj_verb_dobj that ... took ... advantage
nsubj_verb_dobj movement ... took ... crisis
nsubj_verb_dobj movements ... used ... that
nsubj_verb_dobj movements ... used ... demonstrations
nsubj_verb_dobj Flaherty ... expressed ... sympathy
nsubj_verb_dobj I ... understand ... frustration
nsubj_verb_dobj Singh ... described ... protests
nsubj_verb_dobj Khamenei ... voiced ... support
nsubj_verb_dobj Khamenei ... voiced ... Kingdom
nsubj_verb_dobj it ... bring ... system
nsubj_verb_dobj we ... build ... system
nsubj_verb_dobj people ... take ... risks
nsubj_verb_dobj they ... reflect ... crisis
nsubj_verb_dobj Council ... set ... precedent
nsubj_verb_dobj We ... regard ... this
nsubj_verb_dobj Venezuela ... condemned ... repression
nsubj_verb_dobj Affairs ... had ... articles
nsubj_verb_dobj populism ... taken ... form
nsubj_verb_dobj protesters ... lent ... support
nsubj_verb_dobj we ... have ... future
nsubj_verb_dobj community ... takes ... some
nsubj_verb_dobj they ... made ... difference
nsubj_verb_dobj which ... sweeping ... nation
nsubj_verb_dobj many ... reported ... response
nsubj_verb_dobj who ... dislike ... it
nsubj_verb_dobj Australia ... reporting ... lowest
nsubj_verb_dobj impacts ... include ... following
nsubj_verb_dobj that ... burdens ... class
nsubj_verb_dobj print ... mentioned ... inequality
nsubj_verb_dobj movement ... raised ... awareness
nsubj_verb_dobj organizers ... consider ... what
nsubj_verb_dobj organizers ... consider ... wealth
nsubj_verb_dobj protest ... provided ... hundreds
nsubj_verb_dobj Offshoots ... bought ... millions
nsubj_verb_dobj individuals ... owe ... that
nsubj_verb_dobj they ... have ... means
nsubj_verb_dobj movement ... created ... something
nsubj_verb_dobj people ... doing ... things
nsubj_verb_dobj it ... cleared ... million
nsubj_verb_dobj shows ... using ... term
nsubj_verb_dobj shows ... using ... %
nsubj_verb_dobj Office ... made ... references
nsubj_verb_dobj City ... added ... word
nsubj_verb_dobj Conan ... launched ... contest
nsubj_verb_dobj movement ... had ... impact
nsubj_verb_dobj movement ... had ... support
nsubj_verb_dobj government ... pass ... laws
nsubj_verb_dobj which ... overturn ... decision
nsubj_verb_dobj it ... have ... profile
nsubj_verb_dobj Occupy ... influenced ... State
nsubj_verb_dobj movement ... creating ... space
nsubj_verb_dobj he ... mentions ... movement
nsubj_verb_dobj which ... reflects ... fact
nsubj_verb_dobj Magazine ... declared ... Triumph
nsubj_verb_dobj FBI ... formed ... Council
nsubj_verb_dobj FBI ... offered ... plans
nsubj_verb_dobj FBI ... used ... informants
nsubj_verb_dobj FBI ... used ... information
nsubj_verb_dobj which ... gave ... updates
nsubj_verb_dobj organizations ... filed ... suits
nsubj_verb_dobj FBI ... withheld ... documents
nsubj_verb_dobj Shapiro ... sent ... requests
nsubj_verb_dobj FBI ... refused ... request
nsubj_verb_dobj Shapiro ... filed ... complaint
nsubj_verb_dobj document ... confirmed ... plot
nsubj_verb_dobj it ... opened ... investigation
nsubj_verb_dobj critique ... concerns ... itself
nsubj_verb_dobj movement ... focused ... demands
nsubj_verb_dobj that ... differs ... little
nsubj_verb_dobj focus ... paved ... way
nsubj_verb_dobj Emphasis ... encouraged ... people
nsubj_verb_dobj Celebration ... heightened ... skepticism
nsubj_verb_dobj It ... pushes ... us
nsubj_verb_dobj Remarks ... sparked ... criticism
nsubj_verb_dobj protests ... included ... slogans
nsubj_verb_dobj Jews ... control ... Street
nsubj_verb_dobj who ... running ... banks
As the output shows, the pattern nsubj_verb_dobj returns both nominal subjects and direct objects of the verbs, which we defined using different “chains” of the pattern.
We could easily add another chain to the anchor pattern, for example, to search for prepositional phrases, or add further links to either of the existing chains to search for more fine-grained features.
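To make the idea of an additional chain concrete, the following is a minimal sketch rather than the pattern used above. The sentence, its hand-written annotations, the pattern and all names prefixed with toy_ are assumptions made for illustration. The Doc is built by hand so that the example runs without loading a language model.

```python
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Build a small, hand-annotated Doc; the sentence and its annotations are invented
toy_vocab = Vocab()
toy_doc = Doc(toy_vocab,
              words=['Police', 'arrested', 'protesters', 'in', 'Oakland'],
              pos=['NOUN', 'VERB', 'NOUN', 'ADP', 'PROPN'],
              heads=[1, 1, 1, 1, 3],  # Absolute index of the head of each Token
              deps=['nsubj', 'ROOT', 'dobj', 'prep', 'pobj'])

# Define a hypothetical pattern: a verb as the anchor, followed by three chains
toy_pattern = [
    # The anchor: any verb
    {'RIGHT_ID': 'verb', 'RIGHT_ATTRS': {'POS': 'VERB'}},
    # First chain: a nominal subject that is an immediate dependent of the verb
    {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'nsubj',
     'RIGHT_ATTRS': {'DEP': 'nsubj'}},
    # Second chain: a direct object that is an immediate dependent of the verb
    {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'dobj',
     'RIGHT_ATTRS': {'DEP': 'dobj'}},
    # Third, additional chain: a preposition attached to the same verb
    {'LEFT_ID': 'verb', 'REL_OP': '>', 'RIGHT_ID': 'prep',
     'RIGHT_ATTRS': {'DEP': 'prep'}},
]

# Initialise a DependencyMatcher and add the pattern under a made-up name
toy_matcher = DependencyMatcher(toy_vocab)
toy_matcher.add('nsubj_verb_dobj_prep', [toy_pattern])

# Apply the matcher; each match is a tuple of the pattern ID and a list of Token indices
toy_matches = toy_matcher(toy_doc)

# Print the pattern name and the matching Tokens in their order of occurrence
for match_id, token_ids in toy_matches:
    print(toy_vocab.strings[match_id],
          ' ... '.join(toy_doc[i].text for i in sorted(token_ids)))
```

Because every chain starts from the same anchor, a verb only matches if it has a nominal subject, a direct object and a preposition among its immediate dependents. Adding a chain therefore narrows the search rather than widening it.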
Examining matches in context using concordances#
We can examine matches in their context of occurrence using concordances. In corpus linguistics, concordances are often understood as lines of text that show a match in its context of occurrence.
These concordance lines can help us understand why and how a particular token or structure is used in a given context.
To create concordance lines using spaCy, let’s start by importing the Printer class from wasabi, which is a small Python library that spaCy uses for colouring and formatting messages. We will use wasabi to highlight the matches in the concordance lines.
We first initialise a Printer object, which we then assign under the variable match. Next, we test the Printer object by printing some text in red colour.
# Import the Printer class from wasabi
from wasabi import Printer
# Initialise a Printer object; assign the object under the variable 'match'
match = Printer()
# Use the Printer to print out some text in red colour
match.text('Hello world!', color='red')
Hello world!
We then proceed to loop over the results returned by the Matcher object morph_matcher. As we learned above, the results consist of Span objects in a list, which are stored under the variable morph_results.
We loop over items in this list and use the enumerate() function to keep track of their count. We also provide the argument start with the value 1 to the enumerate() function to start counting from the number 1.
During the loop, we refer to this count using the variable i and to the Span object as result. The number under i is incremented with every Span object.
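The effect of the start argument can be checked on a plain Python list. In the sketch below, the list toy_results is a made-up stand-in for the Span objects stored under morph_results.

```python
# A made-up list standing in for the Span objects in 'morph_results'
toy_results = ['Fund released', 'Jordan expressed', 'Gore called']

# Without the argument 'start', enumerate() begins counting from 0
default_counts = [i for i, result in enumerate(toy_results)]

# With start=1, the count begins from 1 instead
shifted_counts = [i for i, result in enumerate(toy_results, start=1)]

# Print both lists of counts for comparison
print(default_counts)
print(shifted_counts)
```

Only the count changes: the items are returned in the same order in both cases.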
We then print out the following output for each Span object in the list morph_results:
i: The number of the item in the list.
doc[result.start - 7: result.start]: A slice of the Doc object stored under the variable doc, which we searched for matches. As usual, we define a slice using brackets and separate the start and end of a slice using a colon. We take a slice that begins 7 Tokens before the start of the match (result.start - 7), and terminates at the start of the match (result.start).
match.text(result, color="red", no_print=True): The matching Span object, rendered using the wasabi Printer object match in red colour. We also set the argument no_print to True to prevent wasabi from printing the output on a new line.
doc[result.end: result.end + 7]: Another slice of the Doc object stored under the variable doc. Here we take a slice that begins at the end of the match (result.end) and terminates 7 Tokens after the end of the match (result.end + 7).
Essentially, we use the indices available under start and end attributes of each Span to retrieve the linguistic context in which the Span occurs.
# Loop over the matches in 'morph_results' and keep count of items
for i, result in enumerate(morph_results, start=1):

    # Print following information for each match
    print(i,  # Item number being looped over
          doc[result.start - 7: result.start],  # The slice of the Doc preceding the match
          match.text(result, color='red', no_print=True),  # The match, rendered in red colour using wasabi
          doc[result.end: result.end + 7]  # The slice of the Doc following the match
          )
1 unity among the "99%".The Community Environmental Legal Defense Fund released a model community bill of rights,
2 raid was chaotic and violent, but Oakland Police Chief Howard Jordan expressed his pleasure concerning the operation because neither
3 process.In March 2012, former U.S. Vice President Al Gore called on activists to "occupy democracy"
4 few demands. On 12 October 2011 Los Angeles City Council became one of the first governmental bodies in
5 by bullhorn, after reviewing it, Judge Jed S. Rakoff sided with plaintiffs, saying, "a
6 other countries."
Canada— Finance Minister Jim Flaherty expressed sympathy with the protests, stating "
7 of that."
India— Prime Minister Manmohan Singh described the protests as "a warning for
8 of governance".
Iran— Supreme Leader Ayatollah Khamenei voiced his support for the Occupy Movement saying
9 —On 21 October 2011, former Prime Minister Gordon Brown said the protests were about fairness. "
10 Abraham Foxman, national director of the Anti-Defamation League stated that "it's not surprising that
11 . The Direct Action Working Group of Occupy Wall Street endorsed diversity of tactics from the earliest days
12 march across the Brooklyn Bridge. The New York Times reported that more than 700 arrests were made
13 A. Myerson, a media coordinator for Occupy Wall Street said , "The cops watched and did
14 on 18 November 2011, campus police Lieutenant John Pike used pepper spray on seated students. The
15 Economic Forum. On 17 March, Occupy Wall Street attempted to mark six months of the movement
16 clock until 23 July 2013, when Mayor Charlie Hales ordered the removal of the vigil and associated
17 The Roman Catholic church Santi Marcellino e Pietro al Laterano received extensive damage, including a statue of
18 an environmentalist protest against plans to replace Taksim Gezi Park developed into wider anti-government demonstrations.
19 Executive Director Alison Bethel McKenzie of the International Press Institute commented : "It is completely unacceptable to
20 of the occupation.
Brazil— President Dilma Rousseff said , "We agree with some of
21 . On Saturday 26 November 2011, Edinburgh City Council set a worldwide precedent by voting in favour
22 Occupy Edinburgh.
United States— President Barack Obama spoke in support of the movement, but
23 City, Portland, Oakland, and New York City sent in police to crack down on the
24 removed by police.
Venezuela— President Hugo Chávez condemned the "horrible repression" of the
25 In January 2012, members of the American Dialect Society voted with an overwhelming majority for "Occupy
26 instability. It formed part of what Manfred Steger called the "global justice movement".The
27 Washington Post, the movement, which Cornel West described as a "democratic awakening",
28 Nothing' first emerged." The Huffington Post noted that "During one incident in March
29 financial crisis. Adbusters co-founder Kalle Lasn registered the OccupyWallStreet.org web address on 9 June
30 the inspirations for the movement was the Democracy Village set up in 2010, outside the British
31 Hood tax planned for 29 October. Naomi Wolf argued that the impression created by much of
32 Some commentators such as David Graeber and Judith Butler criticized the idea that the movement must have
33 . The progressive provider May First/ People Link offered cost-free memberships for dozens of
34 ."In late May 2011, sociologist Manuel Castells congratulated Spanish occupiers for the fact that not
35 female protestors. In early October, Naomi Klein congratulated New York occupiers for their commitment to
36 wished to stay. Rick Hampton for USA Today said the vast majority of occupy members have
37 the global movement in December 2011, Anthony Barnett said its nonviolence remained an immense strength.
38 the Occupy Wall Street Movement, but Kanye West justified his appearance as helping give power back
39 Yoko Ono, Mark Ruffalo, and Michael Moore tweeted and showed their support.
Many
40 .
The WikiLeaks endorsed news site WikiLeaks Central began promoting the idea of a "US
41 , and the American WikiLeaks Central writer Alexa O'Brien modeled the concept after the Day of Rages
42 a forcible eviction. Financial Times editor Richard Lambert suggested that the shift to confrontational tactics by
43 of the movement, Financial Times journalist Shannon Bond found that issues of concern included: "
44 18 months. On 22 December The Washington Post reported that some of the cities which had
45 .
On 2 January 2012, Occupy Nigeria began , sparked by Nigeria's President Goodluck
46 shutting down whole cities. On 16 January Jonathan responded by announcing he would bring prices back
47 relief to the New York area since Hurricane Sandy hit , Occupy London's Occupy Economics group
48 April 2016, hundreds of supporters of Bernie Sanders protested outside of CNN's Headquarters in Los
49 hiatus in activism on location, the Occupy Movement organized the Occupy ICE phase in order to
50 .On August 19, 2018, Occupy Kalamazoo began an encampment in Bronson Park to address
51 -occupying multiple sites since.
Occupy Sydney had an ongoing occupation in Martin Place since
52 Klárov" in Prague was started. Pirate Party participated in the occupation. Police dissolved the
53 ", a permanent occupation of the United Nations controlled buffer zone in the centre of the
54 of the European Central Bank, and Occupy Berlin established a protest camp at St. Mary's
55 . On 13 August 2012, the High Court ruled that the protesters must leave the occupied
56 Cork, Limerick and Galway. The Irish Times described the movement in the following terms:
57 protest, many of the catchphrases of Occupy Seoul contained anti-government or anti-American
58 of the observers has argued that " South Korea overcame the 2008 financial crisis relatively well and
59 riots in 2009. The 15- M Movement drew inspiration from 2011 revolutions in Tunisia,
60 Wales. On 8 January 2012, Lancaster Police arrested four members of Occupy Lancaster who were
61 ".
In Northern Ireland, Occupy Belfast initiated its protest outside the offices of Invest
62 Invest NI on 21 October 2011. Occupy Belfast took residence at Writer's Square, in
63 place in the near future.
Occupy Coleraine took over the University of Ulster Common Room
64 the Occupy movement worldwide. Protesters from Occupy Glasgow set up in the civic George Square on
65 and demonstrations outside Cardiff magistrates court. Occupy Cardiff set up a new camp in the city
66 the January/February 2012 issue, Francis Fukuyama argued that the Occupy movement was not as
67 survey for the think tank Center for American Progress suggested that the Occupy movement has succeeded in
68 In early December 2011, business magnate Richard Branson said the movement is a "good start
69 difference.On 15 December 2011, Jesse Jackson said that Jesus Christ, Gandhi, and
70 .On 10 November 2011, The Daily Telegraph reported that the word "occupy" had
71
On 27 December 2011, the Financial Times argued that the movement had had a global
72 ." Also in November 2011, Paul Mason said that the Occupy movement had started to
73 part of the political discourse and The Atlantic Magazine declared "The Triumph of Occupy Wall Street
74 of Financial Stability at the Bank of England stated that the protesters were right to criticise
75 2010, students across the University of California occupied campus buildings in protest against budget cuts
76 some journalists and commentators the camping in Spain marked the start of the global occupy movement
77 additional attention when the internet hacker group Anonymous encouraged its followers to take part in the
78 the top 400 income earners in the U.S. saw their income increase 392% and their
79 not have clear demands was false. Wolf argued that they did have clear demands including
80 , and Meetup to coordinate events. Indymedia helped the movement with communications, saying there
81 It showed 40% of users produced Occupy related content during peak activity of the movement
82 presidential candidates over others.
The WikiLeaks endorsed news site WikiLeaks Central began promoting the
83 . A list of events for 15 October included 951 cities in 82 countries. On
84 FT, offered a different view. Gapper said that it may be advantageous that the
85 to one. In late January, Occupy protested at the World Economic Forum. On
86 concerns'. On 25 June, Feds ordered the protestors to vacate government environs or
87 fact that the protesters were peaceful, HSBC filed a lawsuit for their eviction. On
88 Italian cities the same day. In Rome masked and hooded militants wearing makeshift body armor
89 Kelantan with Occupy Kota Bharu.
Occupy began in Mexico City on 11 October 2011
90 Nigeria.
The Occupy movement in Norway began on 15 October with protests in Oslo
91 's Time for Outrage!, the NEET troubled generation and current protests in the Middle
92 government demonstrations. Demands issued on 4 June included
the end of police brutality,
93 in July 2012, the City of Vancouver added the word to its list of reserve
94 In December 2012, the Television show Conan launched a contest called "Occupy Conan"
95 President Joe Biden, have suggested that Occupy influenced the President's January 2012 State of
96 collected by corporate security, and the FBI offered to bank officials its plans to prevent
97 Zions Bank about planned protests. The FBI used informants to infiltrate and monitor protests;
98 with private corporate security officials. The FBI withheld documents requested under the FOIA citing the
99 suppressed sniper rifles". When the FBI refused the request, Shapiro filed a federal
100 When the FBI refused the request, Shapiro filed a federal complaint in Washington, D.C.
This returns a set of concordance lines highlighting the matches in their context of occurrence.
Note that in some cases, the preceding or following Tokens consist of line breaks indicating a paragraph break, which causes the output to jump a row or two.
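The slicing behind the context windows can also be sketched using a plain Python list, which makes one edge case easier to see. If a match starts fewer than 7 items from the beginning, subtracting 7 from its start gives a negative index, which Python sequences count from the end. The function context_window and the toy sentence below are assumptions made for illustration. Whether Doc objects treat negative indices in exactly the same way is not verified here.

```python
def context_window(tokens, start, end, width=7):
    """Return the left context, the match and the right context as lists."""
    # Clamp the left boundary at 0, because a negative start index
    # would be counted from the end of the sequence
    left = tokens[max(start - width, 0):start]

    # A slice that runs past the end of the sequence is simply cut short
    right = tokens[end:end + width]

    return left, tokens[start:end], right


# A made-up sentence, split into items by whitespace
toy_tokens = 'Police arrested four members of Occupy Lancaster on Sunday'.split()

# Retrieve the context for a match that covers the second item only
left, hit, right = context_window(toy_tokens, start=1, end=2)

# Print the concordance line with the match in square brackets
print(' '.join(left), '[' + ' '.join(hit) + ']', ' '.join(right))

# For comparison, take the same left context without clamping
naive_left = toy_tokens[1 - 7:1]

# Print the result of the unclamped slice
print(naive_left)
```

In this sketch, the clamped slice keeps the one item that precedes the match, whereas the unclamped slice comes back empty. The matches shown above all appear to fall well beyond the first seven Tokens, so their preceding context is unaffected.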
This section should have given you an idea of how to search linguistic annotations for matching structures using spaCy.
In the following section, you will be introduced to word embeddings, a technique for approximating the meaning of words.