On Wed, Apr 18, 2007 at 09:54:39AM +0100, David Jones wrote:
> Phase 3 does not need a little bit of knowledge from Phase 4.

Footnote 6 (admittedly non-normative, but read on) seems to explicitly
state that it does.

> The problem you referred to earlier was that of differentiating the  
> string "foo\nar" from the included file "foo\nar".  In stage 3 there  
> is no distinction, it's all just pp-tokens.  You can create a problem  
> for yourself if you decide that your frontmost lexer can distinguish  
> strings from included files, but really the C standard says that  
> strings don't become strings as we know them until stage 5.

Right, one problem is if you're trying to categorize the pp-tokens
before passing them to phase 4.  The sequence

  "foo\nar"

is ambiguously either a header-name or string-literal when you don't
have the phase 4 context available.  As you say, one approach is to
just consider it a generic pp-token and figure it out later.

But there's a more difficult case I forgot about:


which can be pp-tokenized two very different ways:

  header-name
  punctuator identifier punctuator

Choosing the correct pp-token sequence here does require phase 4
context.  The Standard explicitly mentions this problem (as well as
header-name vs. string-literal) in 6.4p4.
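
To make the dependency concrete, here is a rough sketch of a
pp-tokenizer that takes the phase 4 context as an explicit flag.  It is
not taken from any real compiler: the names pp_kind, next_pp_token,
count_pp_tokens and in_include are all made up for illustration, the
input <foo> used in the note below is a hypothetical example, and the
scanner only handles the few token shapes discussed in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pp-token kinds, for this sketch only. */
enum pp_kind {
    PP_END, PP_HEADER_NAME, PP_STRING_LITERAL, PP_IDENTIFIER, PP_PUNCTUATOR
};

/*
 * Scan one pp-token starting at *p and advance *p past it.
 * in_include stands in for the phase 4 context: nonzero means the
 * text being scanned follows "#include".
 */
static enum pp_kind next_pp_token(const char **p, int in_include)
{
    const char *s;

    assert(p != NULL && *p != NULL);
    s = *p;
    while (*s == ' ' || *s == '\t')
        s++;
    if (*s == '\0') {
        *p = s;
        return PP_END;
    }

    /* '<' starts a header-name only in #include context. */
    if (*s == '<' && in_include) {
        const char *end = strchr(s + 1, '>');
        if (end != NULL) {
            *p = end + 1;
            return PP_HEADER_NAME;
        }
    }

    /* Same spelling, two kinds: the context picks the kind. */
    if (*s == '"') {
        const char *q = s + 1;
        while (*q != '\0' && *q != '"') {
            /* Backslash escapes are skipped only in string literals. */
            if (!in_include && *q == '\\' && q[1] != '\0')
                q++;
            q++;
        }
        if (*q == '"') {
            *p = q + 1;
            return in_include ? PP_HEADER_NAME : PP_STRING_LITERAL;
        }
        /* Unterminated quote: falls through as a lone character. */
    }

    if (isalpha((unsigned char)*s) || *s == '_') {
        while (isalnum((unsigned char)*s) || *s == '_')
            s++;
        *p = s;
        return PP_IDENTIFIER;
    }

    /* Simplification: every other character is a one-char punctuator. */
    *p = s + 1;
    return PP_PUNCTUATOR;
}

/*
 * Count the pp-tokens in s and report the kind of the first one.
 * Simplification: the flag is applied to the whole string, whereas a
 * real preprocessor would apply it only right after #include.
 */
static size_t count_pp_tokens(const char *s, int in_include,
                              enum pp_kind *first)
{
    size_t n = 0;
    enum pp_kind k;

    while ((k = next_pp_token(&s, in_include)) != PP_END) {
        if (n == 0 && first != NULL)
            *first = k;
        n++;
    }
    return n;
}
```

With the flag set, count_pp_tokens("<foo>", 1, &k) sees a single
header-name; with it clear, the same characters come out as three
pp-tokens (punctuator identifier punctuator).  The quoted form is one
pp-token either way, and only its kind changes.  The point is just that
the scanner cannot pick between the two results without being told
something that is only known in phase 4.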

                                                  -Dave Dodge