Evil, but cute & gay about it
...ramblings of the imperfectly innocent
Important safety tip 
24th-Nov-2008 05:10 pm
gabe, frustration
After pulling my hair out all day trying to figure out why important code suddenly got 10-20% slower, I noticed a parenthetical aside in the gcc documentation that casually mentioned that functions won't get inlined when they are used before they are defined, even when they are explicitly declared inline. Which means that stuffing all your tiny static inline utility functions at the end of the file is great for readability but sucks for performance. Awesome. Oh, and "-Winline" won't mention this either. Thanks, gcc!
(Deleted comment)
25th-Nov-2008 02:44 am (UTC)
You are clearly much less cynical than I, since you are implicitly assuming they will treat this as anything other than "works as designed". :)
25th-Nov-2008 02:11 pm (UTC)
Are you saying GCC was bought by Oracle?

Me: "This feature doesn't work."
Oracle: "Why are you doing that?"
Me: "You have three chapters about this feature in your documentation."
Oracle: "Oh, you shouldn't do that. We should remove it from our documentation."

Sigh. That was really a call.
25th-Nov-2008 03:02 pm (UTC)
Heh. Oracle is awesome.

Some of my favorite pieces of documentation of all time come from them.
Somebody should make an Oracle one-a-day calendar.

Anyway, not quoting the exact text...

"Oracle implements all syntaxes specified in RFC XXXX, RFC YYYY ..."

... goes on to list every RFC syntax ...

... much later on ...

"Oracle only enforces the syntaxes of Telephone Number and Octet String."

For those of you who are not LDAP people: this is like saying you've implemented every data type and every schema constraint specified in SQL2008, but later on mentioning that your product stores all data in VARCHAR and, while allowing you to create any constraint, only "enforces" the UNIQUE() constraint.

Oracle Internet Directory is probably not the worst LDAP out there, but IMO it is easily the most bizarre LDAP implementation.

25th-Nov-2008 04:15 pm (UTC)
I think AD is even more bizarre. If you've ever tried to program against it, you'll find it extremely frustrating as it tries to "help" you by denormalizing returned data. So when you go and, say, try to make a change, it will barf back at you because the data you're trying to modify isn't where it just said it was, but in some other node of the tree.

Yay for introducing unhelpful helpfulness!

25th-Nov-2008 07:13 pm (UTC)
Yeah, AD is weird too, but in fairness it is a pretty complete implementation for an application-specific LDAP. Much more so than, say ... RACF's LDAP interface to their AuthN/AuthZ data.

Now, RACF's ldap is weird.

I don't _like_ AD, but then I don't like OpenLDAP either. :P
Ok, I'd much rather use OpenLDAP than AD, but that's not the
point; neither one sounds like fun.

For all the flak AD gets, it is not the worst, IMO. Just the one pushed down most people's throats. AD can't federate and I wouldn't want to try to keep one secure, but as an application LDAP it is _not_ that bad.

Well I suppose more to the point most LDAP implementations suck.

25th-Nov-2008 10:05 pm (UTC)
On that, I cannot disagree.
25th-Nov-2008 01:57 am (UTC) - err...
Doesn't everybody know that things have to be defined before they are meaningfully usable?

I have always been a "main at the bottom" guy, because there is no way for a compiler to intelligently judge the complexity of something it hasn't yet seen, so putting the definition of _any_ function after its first use in a module is first-order optimizer sabotage.

This is also the same reason that, when using classes in C++, I put the private, then the protected, then the public parts of the class in that order. None of this public at the top nonsense. It is just as easy to find at the bottom of the class.

Both techniques are also (IMHO of course) easier for the follow-on programmers, since they then know that they always only need to search up through the code to find where things are defined etc. None of this look up then down then up nonsense for my code, no-sir-ee.

Then again I was raised in the prehistory of modern languages when a second pass of the compiler was painful and time consuming.

This is also a reason why inlined functions live in "include books" in older mainframe-like systems.

Don't let "Java 101 for Art Majors" coding styles lead you astray. The place I went immediately after my "86 Days Of Shame" at F5 had a "Functions May Not Be Defined in Class Declarations" standard, and an "All Class Variables Must Use Accessors" standard.

I was powerless to make them understand that they were turning almost every integer fetch into a far function call across a module boundary. I did not accept their offer of permanent employment. 8-)

Most of the habits and patterns that are used in Java (and other byte-code interpreted and straight interpreted languages) are antithetical to compiler best practice.

If you reorder all your source files so that definitions precede use, hopefully to the extent that about 80% of the functions no longer need separate declarations, you will notice a nontrivial increase in the execution speed of your system.

Likewise, declaring everything you can "static", and proper use of __attribute__ ((visibility("the-appropriate-visibility-class-for-this-function"))), will speed loading times and reduce symbol table size (and associated symbol table runtime memory) in any non-trivial application. Not that most people even take the time to care.
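As a sketch of the linkage tools being described (a GCC/Clang extension, meaningful mainly for ELF shared objects; the function names here are made up for illustration):

```c
/* Hypothetical example: "static" gives internal linkage, and the
 * visibility attribute keeps non-static symbols out of the exported
 * dynamic symbol table. */
static int internal_helper(int x) {        /* never visible outside this file */
    return x * 2;
}

__attribute__((visibility("hidden")))
int hidden_entry(int x) {                  /* linkable within the DSO, not exported */
    return internal_helper(x) + 1;
}

__attribute__((visibility("default")))
int exported_entry(int x) {                /* the one deliberately public symbol */
    return hidden_entry(x);
}
```

Fewer exported symbols means a smaller dynamic symbol table and less relocation work at load time.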

Funny how all that crap is in there for a reason but nobody ever reads the manual. 8-)

Funny "sad" not funny "ha ha"....

25th-Nov-2008 02:10 am (UTC) - Quick Clarification
The "everything" I was recommending declaring "static" was the set of all function-like things that are never accessed via pointer-to-function. I was not talking about making data static per se.

Oh, and while I am gnarfing about things, on old-style C, remember the "two-const"s rule for pointers.

int F(const char * const thing);

int G(char * const thing);

int H(const char * thing);

For any operation epsilon defined by these co-variant functions, first F(), then G(), and then H() will produce more optimized code, both within the body of the function implementing epsilon and within the calling context. The constness of the pointer usually makes a stronger warrant against data aliasing than the constness of the character pointed to.

The "const" in the second position is analogous to the call-by-reference operator in C++.
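A sketch of what each const position actually constrains, using bodies I've made up to match the G() and H() signatures above:

```c
/* Hypothetical bodies matching the signatures above. */
int G(char * const thing) {   /* const pointer to mutable data */
    *thing = 'x';             /* allowed: the data is writable */
    /* thing++; */            /* error if uncommented: the pointer is const */
    return *thing;
}

int H(const char *thing) {    /* mutable pointer to const data */
    /* *thing = 'x'; */       /* error if uncommented: the data is const */
    thing++;                  /* allowed: the pointer itself may move */
    return *thing;
}
```

The first const (on the data) protects the caller; the second (on the pointer) is the one that tells the optimizer the pointer value itself never changes inside the function.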
25th-Nov-2008 03:10 am (UTC) - Re: err...
Doesn't everybody know that things have to be defined before they are meaningfully usable?

Ahh, but they don't; they merely have to be declared. They have to be defined somewhere, but in most cases it doesn't matter where. It does here, hence the complaint.

This is also (supposed to be) why we have two-pass compilers: to handle this exact case where something is used but not yet seen (defined) on the first pass.
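A minimal sketch of the declared-vs-defined distinction (function names are made up): the call needs only the declaration; inlining needs the definition.

```c
int add(int a, int b);        /* declaration: enough for the compiler to emit a call */

int sum_to(int n) {
    int i, s = 0;
    for (i = 1; i <= n; i++)
        s = add(s, i);        /* used here, defined below: callable, but not inlinable yet */
    return s;
}

int add(int a, int b) {       /* definition: only now does the compiler know the body */
    return a + b;
}
```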

None of this public at the top nonsense.
None of this look up then down then up nonsense for my code, no-sir-ee.

Um, what? :-)

In that style of code, if I am a user of that class, then I will do a grep or tags search for it, which will put me back up to the class, and then I have to search down for the public functions. Public at the top is what prevents up-then-down. It becomes just-up! I must be misunderstanding what you mean...

main at the bottom

Yuck! So, when presented with a program to understand, I have to go to the bottom, then scroll up to the top of main, then scroll back down to read it, then scroll back up past what I just read to go to the top of an ancillary function, then scroll down to read it, then back up for the next function, &c. &c....

If main is at the top, I start there, and just keep scrolling down, down, down. *Much* better IMHO.
26th-Nov-2008 03:26 am (UTC) - Re: err...
If you "declare" something you state that it will exist, but you don't tell the compiler how large/complex it is, so the -only- thing the compiler can -possibly- do is put a far call in the code. If you "define" something you give it a body, and the compiler knows its exact geometry; it can then judge that complexity and decide whether there is a gain to placing the code inline, and if it is inlined it can know how much space it will take. Without the definition the only thing you can do is put in the far call; otherwise the compiler would have to inject blocks of nonsense (no-ops etc.) into the instruction stream, hope the function fit, and then try to burp them back out when they were left over. Either that, or make repeated passes over the code until it ran out of complexity (e.g. resolve all the inline calls inside inline functions etc.).

So as stated, you can not _meaningfully_ use something before it is defined. Yes, you can call it or declare a pointer to it, but none of the manipulations are possible.

So for exactly the reasons that you cannot use a forward declaration (e.g. "struct Bob;") to define a variable of type Bob (e.g. "struct Bob MyBob;") but you -can- use it to define a pointer to a Bob ("struct Bob * pointerthingy;", because a pointer to any struct is of known fixed size), you are stuck with only being able to do pointer-like operations on a function which is -declared- but not -defined-. So you can call, and take the address of, a function after it's declared, but you cannot meaningfully manipulate it (e.g. inline it, find the size of it if your compiler lets you, etc.) until it is defined.

I am sorry, but that is compiler 101, back when they used to actually teach internals in school. I'm surprised the documentation even footnotes the obvious fact that you cannot inline a declared but not-yet-defined function.


Public at the bottom only causes up-down-up if you are not capable of remembering which class you are in, and cannot deduce it from the name of the constructors and/or destructor. Find class, scroll down to public, you will have a passing understanding of the entire private and protected interface (and some clue as to the complexity of the class) by the time you hit public. But that's nearly beside the point.

[Aside: a conforming C++ implementation will relieve you of some of this because the language standard requires the compiler to treat all functions as if they were atomically defined at the close-brace ending the class definition. You will still find your compilations taking less time if you put the private and protected functions that are defined within the class "above" their first use in the public functions defined within the class definition. This being the other reason to put private and protected elements in that order preceding the public sections.]

Hit a breakpoint or grep for -any- symbol, see code pop up in the editor. If the code is arranged in compiler-friendly order you _always_ know that any symbol you encounter is "up" from wherever you are. And if you scroll up, you will find the definition of the function, or if you find just the declaration then you "know" that either (a) the function is ugly enough to need to be circularly defined, or (b) it's in a separate translation unit.

In backwards "main at the top, public at the top" you may, on arrival at a line of code, be either above or below the definition, requiring you to search in both directions for the definition.

Once you are scrolling back and forth, you are, by definition, going up and down.

But on raw entry, if the code is in main/public at the bottom every time you enter into a source context you will know that "answers are above".
26th-Nov-2008 03:27 am (UTC) - Re: err... (part 2 8-)
You have to scroll to the bottom! "Toh Noes!" 8-)

Seriously, in a main-at-the-bottom source module, and any primary-entry-point at the bottom source module in any language, you _always_ know that the primary entry point is at the actual bottom.

With main-at-the-top you actually get main-kinda-near-the-top because the actual top of the top is your includes, and then your typedefs, and then your inlineable functions that you might use in main etc. So main at the top leads to a lot more searching than pressing ctrl-end (or :$ in vi etc) to go to the bottom.

This is so much a part of compiler design and "good thinking" that many languages actually require it (see things like Ada module initializers etc). Still other languages don't allow separate declaration of functions (see erlang, and bash, and tcl etc). Further still, many pseudo-languages and language definition systems (e.g. BNF etc) are strictly define-before-use as well.

By studiously applying the "define before use" rule, even in C where you only have to declare before use, you will find it much easier to jump between languages because you won't have to change habits. There will be no contextual mental shift. Everything you use will be defined above any use of it, every time.

Heck, one could argue that you are biologically and neurolinguisticly (however the hell you spell it) wired up to more naturally process definition before use. It is the natural language way of processing and communicating information. Just check the rules of style for things like using acronyms in a document etc.

Main somewhere near the middle of the top is the stupidest rule ever, and more so once you know about the inlining requirements and natural language design. It's like putting your conclusions at the start of your thesis; it is antithetical to human understanding.

I can't help it if you were taught backwards... 8-)
25th-Nov-2008 02:35 am (UTC) - Odd ..
I would have thought it wouldn't have inlined them before either. I also would have expected some warning that the function was getting the default implicit prototype with -Wall, but perhaps gcc did not do that.


If I am understanding this problem properly, gcc's -Winline behavior is sorta excusable too, but it would certainly have made me sad/angry in your shoes.

Ah C ...

25th-Nov-2008 02:54 am (UTC) - Re: Odd ..
Oh, no no no. The prototype is quite explicit, even mentioning the static and inline aspects. If the code looks like:
static inline int foo(int x);

int bar(void) {
  int a, sum;
  sum = 0;
  for (a = 0; a < 10; a++) {
    sum += foo(a);
  }
  return sum;
}

static inline int foo(int x) {
  return (x << 2 | 0x1);
}
then foo() is *not* inlined in bar(), and -Winline won't tell you that.
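(For comparison, a sketch of the workaround: the same code with the definition hoisted above the first use, which is the layout gcc will actually inline:)

```c
static inline int foo(int x) {  /* definition now precedes the use */
  return (x << 2 | 0x1);
}

int bar(void) {
  int a, sum = 0;
  for (a = 0; a < 10; a++) {
    sum += foo(a);              /* body is known here, so gcc can inline it */
  }
  return sum;
}
```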
25th-Nov-2008 03:09 am (UTC) - Re: Odd ..

Well, that's different... and perfectly reasonable.

In that case .. Shame on you gcc!
-Winline .. Fail
inlining .. Fail

gcc .. Double Fail
26th-Nov-2008 03:41 am (UTC) - Re: Odd ..
How was the compiler supposed to know how big foo() is before it is defined? Without that knowledge the compiler cannot (a) insert the code, (b) optimize the inserted code, or (c) calculate the "relative jumps" necessary to make the for loop.


The compiler could be re-written to make as many passes across the source code as necessary, to handle things like foo() calling some function baz() that was also inlined. Which sounds like a win until you consider that the compiler has to be able to compile multi-thousand-line sources.

This issue of inlining functions is identical to, and for the same reasons as, your inability to do the following.

struct Foo;

struct Bar {
  struct Foo MyFoo;   /* error: Foo is still incomplete here */
  int Thingy;
};

struct Foo {
  char space[200];
};

A declaration of any element of any type in C, including a function, only allows you to declare a pointer to that thing, and dereference that pointer. Calling a function is effectively dereferencing a pointer (that is, an operation of known size and complexity that the compiler or linker can fill in later).
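A sketch of that pointer-only regime (names here are hypothetical):

```c
#include <stddef.h>

struct Foo;                       /* incomplete type: usable only through pointers */
struct Foo *p = NULL;             /* fine: a pointer to an incomplete type has known size */
/* struct Foo f; */               /* error if uncommented: size unknown until defined */

int baz(int);                     /* declaration: enough to call or take the address */
int (*fn)(int) = baz;             /* fine: taking the address needs no body */

int baz(int x) { return x + 1; }  /* the definition can appear later */
```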

bitmonger... triple fail... 8-)

26th-Nov-2008 04:50 pm (UTC) - Re: Odd ..
I don't see why -Winline doesn't work. Gcc knows an inline is on the declaration. So, it should emit a warning during code generation when emitting a non-inlined call.

I actually do write my C programs in the style you describe, incidentally. I understand gcc can't emit an inlined call until a definition is encountered, but not warning with -Winline is disappointing. Certainly, gcc already is a multipass compiler; I would have imagined that for a small compilation unit gcc would have inlined it. I should not have to check the compiler's asm.

I see what you are getting at, but the difference in your struct example seems large to me. First, it is not valid C. As opposed to this example, where the function definition is after the declaration but before a call.

For all the odd warnings gcc emits, -Winline should be clear cut. Gcc knew it was marked inline by declaration and made an inlining decision.

I should not have to look at asm to see if something is inlined or not; that is what -Winline is for. I agree that gcc's inlining choice is legal, I would argue sub-optimal, but the '-Winline' behavior is wrong.

I was curious how common this was..
I only have sparc and intel handy right now ...

[ inlined (gcc -O2 / cc -O -xO3) ]
x86 gcc (apple's) 4.0.1
x86 solaris gcc 3.4.3
x86 solaris sun studio 5.8
sparc solaris gcc 3.4.3
sparc solaris 8, cc 5.4

-Winline worked everywhere I tested it, and gcc and cc basically inlined as I expected them to. Sun Studio 5.4 required -xO3; newer versions did not (-O was fine). gcc 4 (Apple's build) also did not seem to require -O2 (-O was enough), which surprised me.

25th-Nov-2008 05:49 am (UTC)
Welcome to 2008, where C++ compilers still act like glorified macro assemblers...

Use perl instead... :)
25th-Nov-2008 07:08 pm (UTC)
Heh. As much as I love perl, it's not what I would call appropriate for high-speed packet processing. ;-)
25th-Nov-2008 07:45 pm (UTC) - perl gets no respect :P
Actually, it depends on what you mean by high speed.

I've written a complex pure-perl network/OpenGL display application.

It did better than I needed. Cache things that are too expensive; use a few complex pack()s instead of many simple ones. All in all, not too bad.

Now, I am not recommending perl for everything, but for IO it does pretty well.

I personally think C++ is poorly suited for many places it is used. I actually like a modern C dialect like C99 over C++. If my program is so complex that I want lots of classes and OO ... I'd rather not do that in C++, which I think is too hard to refactor compared to C or Perl.

I personally think both C and Perl are easier to refactor than C++.

XS + Perl, or some other HLL + C binding, does better than most people expect.

Incidentally, Radiator is written in Perl. It performs pretty well. It does have one major performance issue: EAP, but that is because it was designed before EAP existed, when all RADIUS requests were cheap, so EAP doesn't fit into its architecture well.

So, I think Perl does very well with IO. It doesn't do so well with memory use, but that can be fixed by some other layer (DBM, memcached, mmap(), XS, a database ...) if you have a lot of data.

Numerical speed in perl is poor, but most times this is not much of an issue.

Just for fun ... not that this is a 'fair' test.

$ time wc -l /var/log/syslog/daemon.log
897124 /var/log/syslog/daemon.log

real 0m6.997s
user 0m6.021s
sys 0m0.944s
$ time perl -ne 'END { print $. }' /var/log/syslog/daemon.log
real 0m4.016s
user 0m3.281s
sys 0m0.735s
$ time ggrep -wc . /var/log/syslog/daemon.log

real 0m2.848s
user 0m2.307s
sys 0m0.542s
$ uname -a
SunOS noc1.dccs.upenn.edu 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T200

Line counting

first place: GNU grep
second: perl
third: wc

(log file was 'live' so that is why numbers change)

25th-Nov-2008 07:51 pm (UTC) - Re: perl gets no respect :P
I just realized perl -0377 is going to be faster.

3.35 seconds now in fact ... so it is still in second place.

Again these numbers don't really mean anything, but it does show language isn't everything.

I mean wc -l should be _fast_, but it's not.

27th-Nov-2008 02:29 am (UTC) - High-speed packet processing?
I've been playing with erlang and that does high speed packet processing better than just about anything. 8-)