- Subject: Re: Tricky little string parsing challenge
- From: "szbnwer@..." <szbnwer@...>
- Date: Tue, 19 Mar 2019 17:08:22 +0000
hi there! :)
another gotcha is when an abbreviation with a period after it
stands at the end of a sentence: in that case there's no extra
full stop, at least in hungarian, but i think in english too... and the
previous wasn't 3 sentences, while this is one! :D btw what about
factorials (5!=120)? or about malformed texts, like mine? u can only
reach higher and higher precision, but it's a hard nut to make it
perfect... what u can actually achieve is to make it semi-automated
and completed on a per-document basis, like collecting the nasty
bits (punctuation, float numbers and capital letters) and hand-selecting
them, or giving it some additional rules, either in general or per
document...
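
a rough sketch of that semi-automated idea, in python only for
illustration: every run of . ! ? is a candidate, the clear cases are
accepted, and the nasty ones are collected for hand-selection. the
names (ABBREVIATIONS, classify, review_list) and the three rules are
assumptions made up for this example, not a complete or tested rule
set, and the abbreviation list would really be per language or per
document.

```python
import re

# Hypothetical abbreviation list; extend it per language or per document.
ABBREVIATIONS = {"etc.", "e.g.", "i.e.", "dr.", "mr.", "vs."}

# A candidate sentence boundary is any run of '.', '!' or '?'.
CANDIDATE = re.compile(r"[.!?]+")


def classify(text, start, end):
    """Label the punctuation run text[start:end] as 'boundary' or 'review'."""
    before = text[start - 1] if start > 0 else ""
    rest = text[end:]

    # The whitespace-delimited word that ends with this punctuation.
    word_start = max(text.rfind(" ", 0, start), text.rfind("\n", 0, start)) + 1
    token = text[word_start:end].lower()

    # Rule 1: a known abbreviation may or may not end the sentence.
    if token in ABBREVIATIONS:
        return "review"

    # Rule 2: digit before and no space after, e.g. 3.14 or 5!=120.
    if before.isdigit() and rest[:1] and not rest[:1].isspace():
        return "review"

    # Rule 3: the next word does not start with a capital letter
    # (malformed text, or punctuation in the middle of a sentence).
    following = rest.lstrip()
    if following and not following[0].isupper():
        return "review"

    return "boundary"


def review_list(text, context=10):
    """Collect (offset, snippet) for every candidate that needs a human."""
    items = []
    for match in CANDIDATE.finditer(text):
        start, end = match.span()
        if classify(text, start, end) == "review":
            snippet = text[max(0, start - context):end + context]
            items.append((start, snippet))
    return items


if __name__ == "__main__":
    sample = "See pi, e, etc. It is 3.14 long. And 5!=120 holds. fine"
    for offset, snippet in review_list(sample):
        print(offset, repr(snippet))
```

the hand-made decisions could then be stored next to the document (or
turned into extra rules) so the same spots don't have to be reviewed
twice.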
good luck, have fun! :D