Parsing a paragraph: detecting lists
Let's say I have the following text:
Steps toward this goal include: Increasing efficiency of mobile networks,
data centers, data transmission, and spectrum allocation Reducing the
amount of data apps have to pull from networks through caching,
compression, and futuristic technologies like peer-to-peer data transfer
Making investments in accessibility profitable by educating people about
the uses of data, creating business models that thrive when free data
access is offered initially, and building out credit card infrastructure
so carriers can move from pre-paid to post-paid models that facilitate
investment If the plan works, mobile operators will gain more customers
and invest more in accessibility; phone makers will see people wanting
better devices; Internet providers will get to connect more people; and
people will receive affordable Internet so they can join the knowledge
economy and connect with the people they care about.
As you can tell by reading the text, it contains several sentences (a list
of points) that run together with no punctuation between them. How can I
split this text into sentences? I've tried Python's NLTK, but had no luck.
Checking for uppercase letters won't work either, as it isn't reliable.
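To make the uppercase problem concrete, here is a minimal sketch of that check, using only Python's standard `re` module. The helper name `split_on_capitals` is hypothetical (it is not part of NLTK), and the rule it applies is just an assumed reading of "check for uppercase letters": break wherever a lowercase word is followed directly by a capitalized word.

```python
import re

def split_on_capitals(text):
    """Naive heuristic (hypothetical helper): break at a space that sits
    between a lowercase letter and an uppercase letter."""
    # The lookbehind/lookahead leave the letters in place; only the space
    # between them is consumed as the separator.
    return re.split(r"(?<=[a-z]) (?=[A-Z])", text)

# Works on a real list boundary from the text above:
good = "data transmission, and spectrum allocation Reducing the amount of data"
print(split_on_capitals(good))
# ['data transmission, and spectrum allocation', 'Reducing the amount of data']

# But it also breaks in the middle of a sentence at a proper noun:
bad = "people will receive affordable Internet so they can join the knowledge economy"
print(split_on_capitals(bad))
# ['people will receive affordable', 'Internet so they can join the knowledge economy']
```

The second call shows why the check isn't reliable: "Internet" is capitalized mid-sentence, so the heuristic produces a false split there.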
Any ideas on how to solve this problem?
Thanks.