On 5/15/2013 7:14 PM, Steve Litt wrote:
> On Wed, 15 May 2013 13:01:12 +0100
> Finn Wilcox <finnw@finnw.me.uk> wrote:
>
>> But the point I was trying to make was that it is pretty much
>> useless, other than causing inconvenience (and for the target
>> audience of a programming book, that inconvenience is quite small.)
>
> You might want to reconsider the "quite small inconvenience"
> assertion. OK, I could paste it into a little GUI that transliterates
> the characters. But:
>
> 1) That's an extra 10 seconds EVERY TIME I want to test out the
>    book's code.
>
> 2) It appears that your conversion removed all indentation.
>
> <shout>The obfuscation of copied and pasted code has no business in a
> programming PDF eBook!!!</shout>
>
> I really think they should recompile the PDF so that code is copyable
> intelligently, complete with indentation, and then offer the revised
> PDF free of charge to people who have bought the book. I bought this
> book specifically so I could copy, paste, try, and mess with the
> book's code. The ability to copy and paste is why I buy PDFs instead
> of print programming books.
I haven't seen the file (as I have the book), but a proper pdf file has the right information to do cut and paste. The information can come from encoding vectors and/or tounicode entries. Because fonts get subsetted, the page stream contains indices into the subset font rather than character codes. So there is a lookup chain with several steps, and its endpoint is the encoding used by the editor or clipboard.
If cut/paste fails then there are several possible reasons:

- the font encoding has meaningless glyph names (not conforming to the adobe naming conventions)
- the file has no tounicode vectors (these are more robust than names)
- the final target has some fuzzy encoding (it expects some codepage and gets utf or so)
- for more complex scripts and/or specific features (like ligatures) some disassembling has to take place
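
To make the lookup chain and its failure modes concrete, here is a minimal sketch in Python. It is not how any particular viewer does it: the function names, the tiny glyph-name table and the sample codes are all made up for illustration. The only real conventions it leans on are that a tounicode entry may map one code to several characters, and that glyph names of the form uniXXXX carry their own code point.

```python
from typing import Dict, Iterable, Optional

# Tiny, made-up excerpt in the spirit of the Adobe Glyph List
# (glyph name -> Unicode text); the real list is far larger.
GLYPH_NAMES: Dict[str, str] = {"H": "H", "i": "i", "space": " "}

REPLACEMENT = "\ufffd"  # what a viewer might emit when every lookup fails


def code_to_text(code: int,
                 to_unicode: Dict[int, str],
                 encoding: Dict[int, str]) -> Optional[str]:
    """Resolve one subset-font code to text, or None if that is impossible."""
    # Step 1: an explicit tounicode entry wins. It may expand one code into
    # several characters, which is how a ligature gets disassembled.
    if code in to_unicode:
        return to_unicode[code]
    # Step 2: fall back to the glyph name from the encoding vector.
    name = encoding.get(code)
    if name is None:
        return None
    if name in GLYPH_NAMES:
        return GLYPH_NAMES[name]
    # Names of the form uniXXXX (four uppercase hex digits) name a code point.
    hex_part = name[3:]
    if (name.startswith("uni") and len(hex_part) == 4
            and all(c in "0123456789ABCDEF" for c in hex_part)):
        return chr(int(hex_part, 16))
    # A meaningless name such as "g123" leaves nothing to go on.
    return None


def extract(codes: Iterable[int],
            to_unicode: Dict[int, str],
            encoding: Dict[int, str]) -> str:
    """Run the lookup for every code, marking failures with U+FFFD."""
    parts = []
    for code in codes:
        text = code_to_text(code, to_unicode, encoding)
        parts.append(text if text is not None else REPLACEMENT)
    return "".join(parts)


# Hypothetical subset font: code 7 is an "fi" ligature with a tounicode
# entry, code 4 only has a meaningless glyph name.
sample_to_unicode = {7: "fi"}
sample_encoding = {1: "H", 2: "i", 3: "space", 4: "g123", 5: "uni00E9"}
copied = extract([1, 2, 3, 7, 5, 4], sample_to_unicode, sample_encoding)
```

In this toy setup only code 4 ends up as a replacement character, which is the "meaningless glyph names and no tounicode" case from the list above.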
Anyway, it's either a pdf generator issue or a pdf viewer issue (more likely).
Obscuring a file by juggling with numbers is possible but then search also fails so that is normally not done. Pdf files that are to be protected have some encryption and/or have some flags set that indicate permissions (these are supposed to be honored by applications).
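
As an illustration of the permission flags: in the standard security handler the encryption dictionary has a /P entry, a signed 32-bit integer in which bit 5 (counting from 1 at the low end) says whether copying or extracting text is allowed. A small Python sketch of how an application might test it follows; the function name and the two sample values are only examples, not taken from the book's file.

```python
COPY_BIT = 1 << 4  # bit 5 in the spec's 1-based numbering: copy/extract


def copy_allowed(p: int) -> bool:
    """Return True when the /P value permits copying or extracting text."""
    # Python's & treats negative ints as infinite two's complement,
    # so the signed /P value can be masked directly.
    return bool(p & COPY_BIT)


# Example /P values (assumed for illustration).
restrictive = -3904  # bits 3 to 6 cleared: no print, modify, copy, annotate
permissive = -44     # print and copy set, modify and annotate cleared
```

Note that this is only a flag: a viewer that ignores it can still extract the text, which is why the flags are "supposed to be" honored.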
Concerning indentation: the copy operation has to guess, mapping horizontal offsets onto spaces. A formatted listing can be either display material or just inline, so it's always a guess.
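
The kind of guess involved can be sketched as follows, assuming the viewer knows the x offset of each line's first glyph and an estimate of the width of one character in the listing font (both in the same unit). The function name and the numbers are hypothetical.

```python
from typing import List, Tuple


def guess_indentation(lines: List[Tuple[float, str]], char_width: float) -> str:
    """Turn (x_offset, text) pairs into text with leading spaces.

    The leftmost line is taken as column 0; every other line is indented
    by its distance from that margin, rounded to whole characters.
    """
    if not lines:
        return ""
    left_margin = min(x for x, _ in lines)
    out = []
    for x, text in lines:
        columns = round((x - left_margin) / char_width)
        out.append(" " * columns + text)
    return "\n".join(out)


# A three-line listing set in a monospaced font with 6-unit-wide glyphs.
listing = [(72.0, "for i = 1, 3 do"), (84.0, "print(i)"), (72.0, "end")]
guessed = guess_indentation(listing, 6.0)
```

If the estimated character width is off, or the listing is indented relative to the surrounding text, the rounded column counts come out wrong, so the result stays a guess.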
Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------