- Subject: Re: Tricky little string parsing challenge
- From: "szbnwer@..." <szbnwer@...>
- Date: Tue, 19 Mar 2019 17:19:32 +0000
+1:
1.2.3:
a) ...
b) ...
ba) ...
bb) ...
c) ...
just try to be smart based on the parentheses! :D btw legal texts
are actually very well formed, so they're a good
playground. :D (<--my future plans) nlp is actually an area where i'd
seriously consider getting a heavy-weight tool for anything
universal, while i'd still prefer my own one for my custom needs, my
own mental models, better understanding, and whatnot. if i can parse
the language that the nlp tool was written in, then i can easily
extract its data set goodies, and probably automate that
for the next version....
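
a rough sketch of the "be smart based on the parentheses" idea, in python just for illustration. it only recognizes the two marker shapes listed above ("1.2.3:" and "a)" / "ba)"); the names MARKER, split_marker and label_depth are made up here, and reading "ba)" as a child of "b)" is an assumption:

```python
import re

# Hypothetical pattern: a dotted section number followed by a colon ("1.2.3:")
# or a short lowercase label followed by a closing parenthesis ("ba)").
# The lookahead requires whitespace or end of line after the marker, so
# things like "12:30" or "f(x)y" are not taken for markers.
MARKER = re.compile(
    r"^\s*(?:(?P<section>\d+(?:\.\d+)*):|(?P<label>[a-z]{1,3})\))(?=\s|$)\s*"
)

def split_marker(line):
    """Return (kind, marker, rest) if the line starts with a list marker, else None."""
    m = MARKER.match(line)
    if not m:
        return None
    if m.group("section"):
        return ("section", m.group("section"), line[m.end():])
    return ("label", m.group("label"), line[m.end():])

def label_depth(label):
    # Assumption: "a)" is depth 1 and "ba)" is depth 2 (a child of "b)").
    return len(label)

for line in ["1.2.3:", "a) ...", "ba) ...", "5!=120 is a factorial"]:
    print(repr(line), "->", split_marker(line))
```

lines that start with such a marker can then be kept out of the sentence splitter, so the "." in "1.2.3" and the ")" in "a)" are never taken for sentence ends.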
bests! :)
2019-03-19 17:08 GMT, szbnwer@gmail.com <szbnwer@gmail.com>:
> hi there! :)
>
> another gotcha is when an abbreviation ending with a period
> stands at the end of a sentence: in that case there's no extra
> full stop, at least in hungarian, but i think in english too... and the
> previous wasn't 3 sentences, while this is one! :D btw what about
> factorials (5!=120)? what about malformed texts, like mine? u can only
> reach higher and higher precision, but it's a hard nut to make it
> perfect... what u can actually achieve is to make it semi-automated
> and tuned per document, like collecting the nasty
> bits (punctuation, float numbers and capital letters) and hand-selecting
> them, or giving it some additional rules, either in general or per
> document...
>
> good luck, have fun! :D
>
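
the "collect the nasty bits and handselect them" idea from the quoted message could look roughly like this in python. it's only a sketch under simple assumptions: whitespace plus a capital letter (or end of text) counts as a clear sentence end, and anything right after a digit (floats, factorials) goes to a review list. the name triage and the context width are invented for the example:

```python
import re

PUNCT = re.compile(r"[.!?]+")

def triage(text, context=12):
    """Split punctuation runs into clear sentence ends and ones to review by hand.

    Returns (clear, review): clear is a list of offsets, review is a list of
    (offset, snippet) pairs giving some surrounding text for a human to judge.
    """
    clear, review = [], []
    for m in PUNCT.finditer(text):
        before = text[max(0, m.start() - context):m.start()]
        after = text[m.end():m.end() + context]
        at_end = m.end() == len(text)
        # Assumption: whitespace followed by a capital letter starts a new sentence.
        starts_sentence = re.match(r"\s+[A-Z]", after) is not None
        # A digit right before the punctuation hints at "3.14" or "5!=120".
        numeric = before[-1:].isdigit()
        if (at_end or starts_sentence) and not numeric:
            clear.append(m.start())
        else:
            review.append((m.start(), before + m.group() + after))
    return clear, review

sample = "It costs 3.14 today. What about 5!=120? Fine."
clear, review = triage(sample)
print("clear:", clear)
for offset, snippet in review:
    print("review:", offset, repr(snippet))
```

abbreviations that end a sentence still slip through as "clear" here, so a per-document abbreviation list would be a natural extra rule on top of this.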