Showing content from https://pypi.python.org/pypi/topia.termextract/ below:

topia.termextract · PyPI

This package determines important terms within a given piece of content. It uses linguistic tools such as Parts-Of-Speech (POS) and some simple statistical analysis to determine the terms and their strength.

Detailed Documentation

An Example - A News Article

This document provides a simple example of extracting the terms of a BBC article from May 29, 2009. We will use several term extraction tools to compare the outcome.

>>> text = '''
... Police shut Palestinian theatre in Jerusalem.
...
... Israeli police have shut down a Palestinian theatre in East Jerusalem.
...
... The action, on Thursday, prevented the closing event of an international
... literature festival from taking place.
...
... Police said they were acting on a court order, issued after intelligence
... indicated that the Palestinian Authority was involved in the event.
...
... Israel has occupied East Jerusalem since 1967 and has annexed the
... area. This is not recognised by the international community.
...
... The British consul-general in Jerusalem , Richard Makepeace, was
... attending the event.
...
... "I think all lovers of literature would regard this as a very
... regrettable moment and regrettable decision," he added.
...
... Mr Makepeace said the festival's closing event would be reorganised to
... take place at the British Council in Jerusalem.
...
... The Israeli authorities often take action against events in East
... Jerusalem they see as connected to the Palestinian Authority.
...
... Saturday's opening event at the same theatre was also shut down.
...
... A police notice said the closure was on the orders of Israel's internal
... security minister on the grounds of a breach of interim peace accords
... from the 1990s.
...
... These laid the framework for talks on establishing a Palestinian state
... alongside Israel, but left the status of Jerusalem to be determined by
... further negotiation.
...
... Israel has annexed East Jerusalem and declares it part of its eternal
... capital.
...
... Palestinians hope to establish their capital in the area.
... '''

TreeTagger

A POS tagger that uses some linguistics to tag a text. Here is its output:

Police          NNS       Police
shut            VVD       shut
Palestinian     JJ        Palestinian
theatre         NN        theatre
in              IN        in
Jerusalem       NP        Jerusalem
.               SENT      .
Israeli         JJ        Israeli
police          NNS       police
have            VHP       have
shut            VVN       shut
down            RP        down
a               DT        a
Palestinian     JJ        Palestinian
theatre         NN        theatre
in              IN        in
East            NP        East
Jerusalem       NP        Jerusalem
.               SENT      .
The             DT        the
action          NN        action
,               ,         ,
on              IN        on
Thursday        NP        Thursday
,               ,         ,
prevented       VVD       prevent
the             DT        the
closing         NN        closing
event           NN        event
of              IN        of
an              DT        an
international   JJ        international
literature      NN        literature
festival        NN        festival
from            IN        from
taking          VVG       take
place           NN        place
.               SENT      .
Police          NNS       Police
said            VVD       say
they            PP        they
were            VBD       be
acting          VVG       act
on              IN        on
a               DT        a
court           NN        court
order           NN        order
,               ,         ,
issued          VVN       issue
after           IN        after
intelligence    NN        intelligence
indicated       VVN       indicate
that            IN        that
the             DT        the
Palestinian     NP        Palestinian
Authority       NP        Authority
was             VBD       be
involved        VVN       involve
in              IN        in
the             DT        the
event           NN        event
.               SENT      .
Israel          NP        Israel
has             VHZ       have
occupied        VVN       occupy
East            NP        East
Jerusalem       NP        Jerusalem
since           IN        since
1967            CD        @card@
and             CC        and
has             VHZ       have
annexed         VVN       annex
the             DT        the
area            NN        area
.               SENT      .
This            DT        this
is              VBZ       be
not             RB        not
recognised      VVN       recognise
by              IN        by
the             DT        the
international   JJ        international
community       NN        community
.               SENT      .
The             DT        the
British         JJ        British
consul-general  NN        <unknown>
in              IN        in
Jerusalem       NP        Jerusalem
,               ,         ,
Richard         NP        Richard
Makepeace       NP        Makepeace
,               ,         ,
was             VBD       be
attending       VVG       attend
the             DT        the
event           NN        event
.               SENT      .
"               ``        "
I               PP        I
think           VVP       think
all             DT        all
lovers          NNS       lover
of              IN        of
literature      NN        literature
would           MD        would
regard          VV        regard
this            DT        this
as              IN        as
a               DT        a
very            RB        very
regrettable     JJ        regrettable
moment          NN        moment
and             CC        and
regrettable     JJ        regrettable
decision        NN        decision
,               ,         ,
"               ''        "
he              PP        he
added           VVD       add
.               SENT      .
Mr              NP        Mr
Makepeace       NP        Makepeace
said            VVD       say
the             DT        the
festival        NN        festival
's              POS       's
closing         NN        closing
event           NN        event
would           MD        would
be              VB        be
reorganised     VVN       <unknown>
to              TO        to
take            VV        take
place           NN        place
at              IN        at
the             DT        the
British         NP        British
Council         NP        Council
in              IN        in
Jerusalem       NP        Jerusalem
.               SENT      .
The             DT        the
Israeli         JJ        Israeli
authorities     NNS       authority
often           RB        often
take            VVP       take
action          NN        action
against         IN        against
events          NNS       event
in              IN        in
East            NP        East
Jerusalem       NP        Jerusalem
they            PP        they
see             VVP       see
as              RB        as
connected       VVN       connect
to              TO        to
the             DT        the
Palestinian     JJ        Palestinian
Authority       NP        Authority
.               SENT      .
Saturday        NP        Saturday
's              POS       's
opening         NN        opening
event           NN        event
at              IN        at
the             DT        the
same            JJ        same
theatre         NN        theatre
was             VBD       be
also            RB        also
shut            VVN       shut
down            RP        down
.               SENT      .
A               DT        a
police          NN        police
notice          NN        notice
said            VVD       say
the             DT        the
closure         NN        closure
was             VBD       be
on              IN        on
the             DT        the
orders          NNS       order
of              IN        of
Israel          NP        Israel
's              POS       's
internal        JJ        internal
security        NN        security
minister        NN        minister
on              IN        on
the             DT        the
grounds         NNS       ground
of              IN        of
a               DT        a
breach          NN        breach
of              IN        of
interim         JJ        interim
peace           NN        peace
accords         NNS       accord
from            IN        from
the             DT        the
1990s           NNS       1990s
.               SENT      .
These           DT        these
laid            VVD       lay
the             DT        the
framework       NN        framework
for             IN        for
talks           NNS       talk
on              IN        on
establishing    VVG       establish
a               DT        a
Palestinian     JJ        Palestinian
state           NN        state
alongside       IN        alongside
Israel          NP        Israel
,               ,         ,
but             CC        but
left            VVD       leave
the             DT        the
status          NN        status
of              IN        of
Jerusalem       NP        Jerusalem
to              TO        to
be              VB        be
determined      VVN       determine
by              IN        by
further         JJR       further
negotiation     NN        negotiation
.               SENT      .
Israel          NP        Israel
has             VHZ       have
annexed         VVN       annex
East            NP        East
Jerusalem       NP        Jerusalem
and             CC        and
declares        VVZ       declare
it              PP        it
part            NN        part
of              IN        of
its             PP$       its
eternal         JJ        eternal
capital         NN        capital
.               SENT      .
Palestinians    NPS       Palestinians
hope            VVP       hope
to              TO        to
establish       VV        establish
their           PP$       their
capital         NN        capital
in              IN        in
the             DT        the
area            NN        area
.               SENT      .

As you can see, TreeTagger's tagging is pretty good, but its output would need further analysis to produce a useful set of terms. Furthermore, TreeTagger is not free for commercial use.
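To give a feel for the kind of analysis the tagged output would still need, here is a minimal sketch that parses the three-column listing (word, tag, lemma) and counts runs of consecutive noun tokens as candidate terms. It is not how TreeTagger or topia.termextract work internally. The helper names `parse_tagged` and `candidate_terms` and the `NOUN_TAGS` set are assumptions made for illustration only.

```python
from collections import Counter

# Tags treated as nouns in this sketch. This set is an assumption chosen
# from the tags visible in the listing above, not an official definition.
NOUN_TAGS = {"NN", "NNS", "NP", "NPS"}


def parse_tagged(output):
    """Split 'word  TAG  lemma' lines into (word, tag, lemma) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            rows.append(tuple(parts))
    return rows


def candidate_terms(rows):
    """Count each run of consecutive noun tokens as one candidate term."""
    counts = Counter()
    run = []
    for word, tag, _lemma in rows:
        if tag in NOUN_TAGS:
            run.append(word)
            continue
        if run:  # a non-noun token closes the current run
            counts[" ".join(run)] += 1
        run = []
    if run:  # flush a run that reaches the end of the input
        counts[" ".join(run)] += 1
    return counts


# A few lines copied from the tagger output above.
sample = """\
Israel          NP        Israel
has             VHZ       have
annexed         VVN       annex
East            NP        East
Jerusalem       NP        Jerusalem
and             CC        and
declares        VVZ       declare
it              PP        it
part            NN        part
of              IN        of
its             PP$       its
eternal         JJ        eternal
capital         NN        capital
.               SENT      .
"""

terms = candidate_terms(parse_tagged(sample))
for term, count in sorted(terms.items()):
    print(term, count)
```

Even on this short sample, the sketch groups "East Jerusalem" into one multi-word candidate but also picks up weak single-word candidates such as "part". This is why some statistical filtering is still needed on top of the tags.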

CHANGES

1.1.0 (2009-06-29)

1.0.0 (2009-05-30)
