

In fact the C standard NEVER forbids different compiled units from redefining the same external symbols.

In fact it even **encourages** such use, which is required, for example, so that every C program can have its own implementation of the "main()" function, which is exported under the same symbol once compiled with the default "C" linkage.

So it's not the C language or the compiler that dictates how these compiled units will be linked together. This is specified only at the linker level (which is not part of the C standard itself). But all linkers require you to specify the order of units unambiguously.

If **any** linker allows you to use "libraries" (containing multiple units in random order), it can process them ONLY if no two distinct units in the same library define the same symbol (otherwise the result would be completely unpredictable).

Shared libraries (or libraries and directories containing an ordering manifest file) are assured to respect this constraint because they are already the result of a previous pass of the linker, for which the order of resolution was already specified and checked.

Shared libraries are not a requirement for any POSIX system compatible with C: these systems can work perfectly well without supporting them (it is enough for such a system to have a linker that supports a manifest in its archive format, or that looks for a manifest file if libraries are represented as a filesystem directory containing all compiled units).

In practice, linkers that recognize archive formats **already look for a manifest file** containing not just the order of units, but a full index mapping each symbol to the name of the unit in which it is defined and exported. This allows much faster linking, because the linker does not have to process every unit file inside the library completely. Such a unique mapping index cannot be built and inserted inside the library if two units in the library define the same symbol.

But:
- old archiving tools (like old versions of "ar" on old versions of Unix) did not check that and did not build this index/manifest: it had to be built and added to the archive separately, and updated each time you added, updated or removed a unit from the library. The old "ar" tool is now completely deprecated. For backup/archiving purposes, "tar" is much better and universally supported on all Unix/Linux variants, and most archives are now compressed ("taz", "tgz", "zip", "gz", "xz": you have the choice...)
- archive formats containing a manifest or index solve the problem for linkers. This solution is used by linkers inside virtual machines like Java, .Net, Perl, Python... (even Lua uses such a solution, even if it's hidden behind the concept of "loaders", which are actually linkers that programs themselves can control and tune for their needs)
- shared libraries are in fact much cleaner, more compact, and much faster to process when creating native programs: the ELF format (or similar variants) is now almost universal across Unix/Linux/Windows and many other systems (and they still allow compiled programs to invoke or use the system linker themselves using "loaders")

This solution based on shared libraries (or archives with manifests) and the concept of generic "loaders" is not just for linking standalone programs: it also exists in scripting languages and in programs running on virtual machines. These programs can control the resolution order themselves and control the environment path in which external libraries or units will be found; they can check constraints like security requirements, access rights and digital signatures; they can use network services to download the units; they can use conditions like the user's locale and other preferences; they can try to best match the architecture, such as i386 vs. x64 vs. i686, when they have the choice; they can perform comparative benchmark tests before deciding which implementation to use...

**Absolutely nothing in the C standard** requires the units needed or used in the same program to be limited to sets of unique symbols!

All that the C standard says is that the compiler will never create a separate unit containing multiple instances of the same extern symbol with "C" linkage. (Some compilers for specific systems may be exceptions to this rule: they may still create **simultaneously** multiple implementations of the same source, compiled for different architectures, or for different goals such as debugging, or with different levels or methods of optimization which may not be safe in all situations, such as relocatable vs. reentrant versions for multithreading. In that case the object format will actually be like an archive with several distinct entry points for the same symbol, distinguished by some encoded goals or by some "decoration" in the encoded exported symbols. But this is equivalent to exporting distinct symbols, and it works only if the linker recognizes the multiple encoded symbols and knows the rules to locate them and to decide predictably which implementation is the most suitable.)


On Sun, 25 Nov 2018 at 02:45, Philippe Verdy <verdy_p@wanadoo.fr> wrote:


On Sun, 25 Nov 2018 at 02:23, Sean Conner <sean@conman.org> wrote:
It was thus said that the Great Philippe Verdy once stated:
> On Sat, 24 Nov 2018 at 23:20, Sean Conner <sean@conman.org> wrote:
        % uname
        Linux
        % cc    -c -o main.o main.c
        % cc    -c -o myfunc2.o myfunc2.c
        % cc -shared -fPIC -o func.ss func.c
        % ar rv libfuncall.so func.ss
        ar: creating libfuncall.so
        a - func.ss
        % cc  -Wl,-rpath,/tmp/foo -o smain2 main.o myfunc2.o libfuncall.so
        % ./smain2
        Hello from main
                Hello from func1
                        Hello from myfunc2
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

  Happy now?

That's what I wanted. And it demonstrates what I wanted to show: this is the only portable and expected behavior!

>> There's no such limitation or portability issue when you don't use ANY
>> static library but only use "prelinked" shared libraries (like DLLs on
>> Windows or ELF libraries on Linux; note that executable and DLL modules on
>> Windows, as well as on OS/2, are based directly on the ELF format)!

 > Citation needed.

I gave citations as examples, because I cannot find any counter-example, using shared/prelinked libraries (or archive formats containing an explicit manifest of the expected link order), where this is not true.

So instead I ask you to demonstrate the existence of any counter-example!

The C standard itself does not really say how units are linked into a working program: it does not even need "libraries" to exist. It only speaks of the possibility of creating programs from collections of separately compilable units (that's the meaning of the "extern" keyword in C). You then need an external linker, and you must specify the units needed for a working program in a well-defined resolution order (basic libraries, which are only collections of units in unspecified order, are not compatible with this goal).

But shared libraries/prelinked libraries are compatible with this goal, as their role is to prelink the units partially, using the same linker and the same specified order, which then gives a predictable resolution order for symbols in the library (something impossible to achieve using only basic static libraries containing an arbitrary number of object units in random order).

It's not a basic "librarian" tool that resolves symbols. The librarian is just a way to archive and pack multiple files into a single one, from which they can be extracted without modification, because a single file is generally faster to process than a large collection of small files; that speed advantage was certainly true with past filesystems, but it's no longer the case with modern filesystems. Only a true linker (actually made for building programs) does the work of resolving symbols accurately and predictably!