[personal profile] gusl
This sounds like an obvious question, but I'm asking because I don't know the answer.

Why do Lisp programmers use Lisp libraries, C++ programmers use C++ libraries, and so forth? Why can't we import the same compiled libraries into all the languages?

(no subject)

Date: 2007-01-19 03:00 am (UTC)
From: [identity profile] gwillen.livejournal.com
Because cross-language calls are generally not considered at all by theoreticians, and are usually a pain in the ass in actual practice, so people avoid them like the plague.

Most languages have slightly different ideas about basic data types, and slightly different conceptions of what a function is and how a function call works.
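
To make the data-type point concrete, here is a minimal sketch in Python using the standard ctypes module. Python integers are arbitrary precision while a C int32_t is a fixed 32 bits, so a value gets squeezed at the boundary. The helper name to_c_int32 is made up for this illustration.

```python
import ctypes

def to_c_int32(n: int) -> int:
    """Return the value a 32-bit C integer would hold after receiving n."""
    # ctypes does no overflow checking here; only the low 32 bits survive.
    return ctypes.c_int32(n).value

print(to_c_int32(42))     # fits in 32 bits, comes back unchanged
print(to_c_int32(2**31))  # does not fit, wraps around to a negative number
```

Neither language is wrong here; they simply disagree about what an "integer" is, and something at the boundary has to decide what happens to the values that do not fit.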

At the easy end of the spectrum, take a call from C++ into C. Already the compiler has to distinguish between C and C++ functions (this is what extern "C" declarations are for), so that it applies name mangling, overload resolution, and virtual dispatch to C++ functions only, while calling C functions in a backwards-compatible manner. And this is honestly a trivial case, since C is very nearly a subset of C++. What about calling from C into C++? Unless the C++ side deliberately exports a plain function with C linkage, this is a real pain. A C compiler knows nothing about mangled names, overloads, or member functions, and a virtual call in C++ does a dynamic lookup to decide where to jump -- the programmer would have to code that dispatch by hand to call such a function from C.

Now suppose a Lisp program calls into a C library, passing it a list of integers. The C library now gets a complicated tree structure involving pointers with flag bits packed in the lower nybble. If the C programmer wants the numbers in an array, they have to unpack and repack them manually; if they want to work directly with the list as given, they have to essentially import the Lisp runtime into C so they have some primitive functions for working with Lisp datastructures.
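
Here is a toy model of that unpacking step, written in Python for readability. Everything in it is an assumption made for illustration: the heap is a plain list of integer "words", the tag lives in the low two bits rather than a full nybble, and the names make_list and unpack_to_array are invented. No real Lisp implementation is being described.

```python
TAG_MASK = 0b11
TAG_FIXNUM = 0b00   # word holds a small integer shifted left by two bits
TAG_CONS = 0b01     # word holds the heap index of a cons cell, shifted left
NIL = 0b10          # distinguished word standing for the empty list

def make_list(heap, values):
    """Build a Lisp-style list in `heap` and return the tagged word for it."""
    word = NIL
    for v in reversed(values):
        addr = len(heap)
        heap.append((v << 2) | TAG_FIXNUM)  # car: tagged integer
        heap.append(word)                   # cdr: tagged pointer or NIL
        word = (addr << 2) | TAG_CONS
    return word

def unpack_to_array(heap, word):
    """What the foreign side must do by hand: chase cells and strip tags."""
    out = []
    while word != NIL:
        if word & TAG_MASK != TAG_CONS:
            raise TypeError("not a proper list")
        addr = word >> 2
        car, word = heap[addr], heap[addr + 1]
        if car & TAG_MASK != TAG_FIXNUM:
            raise TypeError("element is not a fixnum")
        out.append(car >> 2)
    return out

heap = []
lisp_list = make_list(heap, [10, 20, 30])
print(unpack_to_array(heap, lisp_list))
```

The point is that unpack_to_array has to know the tag layout and the cell layout exactly, which is knowledge that normally lives only inside the Lisp runtime.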

And I haven't even considered garbage collection yet. Suppose (and this is a case which happens often enough that there's documentation about the potential issues) a Lisp function calls into a C function, which then calls into a Lisp function (a callback, say). Now if the inner Lisp function triggers a garbage collection and the collector moves objects, then on return every pointer to a Lisp object held by the C function may be stale. Worse, any Lisp objects which were temporarily referenced only by C pointers will be freed altogether, because the collector cannot see those references. (Imagine the catastrophes if you combined Lisp->C calling with multithreading -- the garbage collector could run asynchronously at any time while C code is holding Lisp object pointers, so none of them would ever be safe to use.)
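
The failure mode can be sketched with a toy copying collector. This is a simulation in Python under loud assumptions: ToyHeap, register_root, and the integer "addresses" are all invented for the example, and real collectors are far more involved. It only shows why a raw address held by foreign code goes bad while a registered handle survives.

```python
class ToyHeap:
    """A toy copying collector: each collection moves live objects."""

    def __init__(self):
        self.space = {}       # address -> object
        self.next_addr = 0
        self.roots = {}       # handle -> address; all the collector can see
        self.next_handle = 0

    def alloc(self, obj):
        addr = self.next_addr
        self.next_addr += 1
        self.space[addr] = obj
        return addr           # a raw address, like a pointer handed to C

    def register_root(self, addr):
        handle = self.next_handle
        self.next_handle += 1
        self.roots[handle] = addr
        return handle

    def deref(self, handle):
        return self.space[self.roots[handle]]

    def collect(self):
        # Copy every rooted object to a fresh address and drop the rest.
        new_space = {}
        for handle, addr in list(self.roots.items()):
            new_addr = self.next_addr
            self.next_addr += 1
            new_space[new_addr] = self.space[addr]
            self.roots[handle] = new_addr
        self.space = new_space

heap = ToyHeap()
raw = heap.alloc("kept")          # foreign code remembers this raw address
handle = heap.register_root(raw)  # and, separately, registers a root
lost = heap.alloc("unrooted")     # referenced only by foreign code
heap.collect()
print(raw in heap.space)    # False: the object moved, the raw address is stale
print(heap.deref(handle))   # kept: the handle was updated by the collector
print(lost in heap.space)   # False: the collector never saw a reference to it
```

Real foreign-function interfaces tend to offer something along the lines of the handle here (root registration, pinning, or similar), and forgetting to use it is exactly the kind of bug described above.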

And I still haven't considered inter-language calls in which one of the languages isn't C. Why not? Because if the languages don't provide bit-level manipulation primitives, and they have incompatible datastructures, it's likely IMPOSSIBLE to crosscall between them without a C shim layer in between to translate. And if their datastructures are sufficiently divergent, it may not even be possible to automatically translate without knowing the semantics of the data.

(no subject)

Date: 2007-01-19 03:08 am (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Why can't compiled libraries work like a collection of little compiled programs? Each call to one of the library's functions would be translated into an execution of the corresponding program: the inputs would be encoded on stdin, and the return values would be encoded on stdout. If the input and output objects are serializable, this should be easy to do.
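
A minimal sketch of this idea, in Python with only the standard library. The "library function" is a stand-in: a tiny separate program that reads a JSON list on stdin and writes its sum as JSON on stdout. The names ADDER_PROGRAM and call_out_of_process are made up, and JSON is just one possible encoding.

```python
import json
import subprocess
import sys

# Stand-in for a compiled library function: a separate program that reads
# JSON arguments on stdin and writes a JSON result on stdout.
ADDER_PROGRAM = "import sys, json; print(json.dumps(sum(json.load(sys.stdin))))"

def call_out_of_process(program_source, args):
    """Run the program once, passing args on stdin and decoding stdout."""
    proc = subprocess.run(
        [sys.executable, "-c", program_source],
        input=json.dumps(args),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a nonzero status
    )
    return json.loads(proc.stdout)

print(call_out_of_process(ADDER_PROGRAM, [1, 2, 3]))
```

It works, but every call pays for a process start plus encoding and decoding, and anything that is not plain data (callbacks, shared mutable objects, open handles) has no obvious encoding at all.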

(no subject)

Date: 2007-01-19 03:35 am (UTC)
From: [identity profile] bhudson.livejournal.com
It is theoretically easy, and practically painful. You have to get everyone to agree on how to serialize, after all.
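
A tiny sketch of what "agreeing" means even for a single integer, using Python's standard struct module. The function names are invented for the example; the two sides differ only in byte order.

```python
import struct

def write_int32_little_endian(n):
    """One side serializes a 32-bit integer, least significant byte first."""
    return struct.pack("<i", n)

def read_int32_little_endian(data):
    """A reader that agrees with the writer's byte order."""
    return struct.unpack("<i", data)[0]

def read_int32_big_endian(data):
    """A reader that assumes the opposite byte order."""
    return struct.unpack(">i", data)[0]

payload = write_int32_little_endian(1)
print(read_int32_little_endian(payload))  # 1
print(read_int32_big_endian(payload))     # 16777216
```

Nothing fails loudly in the mismatched case; the reader just gets a different number. Multiply that by strings, nested structures, and optional fields, and the agreement becomes the hard part.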

Oh, and [livejournal.com profile] gwillen didn't even mention the question of garbage collection.

(no subject)

Date: 2007-01-19 03:35 am (UTC)
From: [identity profile] bhudson.livejournal.com
Uh... pretend I can read.

(no subject)

Date: 2007-01-19 03:38 am (UTC)
From: [identity profile] bhudson.livejournal.com
There's always SWIG, which handles enough of these issues to make cross-language work sorta-kinda possible.
(deleted comment)

(no subject)

Date: 2007-01-19 10:09 pm (UTC)
From: [personal profile] tiedyedave
Yes, CORBA and SOAP and friends do work for this sort of interaction, but they're a fundamentally different approach to the issue: rather than having common representations and call disciplines that both languages use (i.e. within a VM) or that one language uses and the other deals with (such as with foo/C bindings), it's a representation that's external to both languages. It's a practical answer to the "you got your peanut butter in my C++" problem, and in many cases (especially distributed heterogeneous systems that can tolerate high latency) it's the right answer, but it does raise the deeper question of why you needed that communication intermediary to begin with.

(no subject)

Date: 2007-01-19 03:38 am (UTC)
From: [personal profile] neelk

This is a pain in the butt for a couple of reasons.

First, when you compile code, it all works subject to certain assumptions -- for example, when you make a function call you need a convention about where to put the arguments, the return address, the return value, and so on. Each language does this differently, because their semantics differ. For example, a C program doesn't have to do anything to support exceptions, whereas a Java, C++, or Lisp program does.

Secondly, in addition to the raw binary the compiler spits out, you also need a runtime system to support the primitive operations of the language. Again, the runtime system will expect data to be in certain formats in certain places, and different languages' runtimes will make different assumptions.

This is why Microsoft calls the runtime underneath .NET the "Common Language Runtime": they supply a common library to be the runtime system for many different languages. Since this forces those languages to agree on many things, inter-language interoperability is much better (though still imperfect). Of course, if your language doesn't conform to the assumptions Microsoft made, then your implementor's life sucks a lot.

(no subject)

Date: 2007-01-19 11:45 am (UTC)
From: [identity profile] jbouwens.livejournal.com
.NET goes one step further than merely offering binary compatibility between languages. You can mix languages to the extent that it is possible to derive a class in one language from a base class written in another language.

This works great, in practice: I've written C++/CLI classes that derived from C# classes contained in a separate library.

(no subject)

Date: 2007-01-19 04:44 am (UTC)
From: [personal profile] tiedyedave
What [livejournal.com profile] gwillen said, especially wrt datatype representation and garbage collection.


Either your languages have agreed on a common runtime representation for certain types of data, or they haven't. If they have, you're either lucky or in a common VM (see below). If they haven't, then touching data written by the other language will involve a translation each time for any nontrivial data type, and these translations will have a runtime cost, which increases as the interlanguage interaction increases.
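
The per-crossing translation cost can be sketched with Python's standard ctypes module. Here double_in_place is only a stand-in for a foreign function that wants the C representation (it is ordinary Python, not real C), and all the function names are invented for the example.

```python
import ctypes

def to_c_array(values):
    """Copy a Python list of ints into a freshly allocated C int32 array."""
    array_type = ctypes.c_int32 * len(values)
    return array_type(*values)          # one copy, element by element

def from_c_array(c_array):
    """Copy the C array back into a Python list."""
    return list(c_array)                # a second copy on the way back

def double_in_place(c_array):
    # Stand-in for a foreign function that works on the C representation.
    for i in range(len(c_array)):
        c_array[i] *= 2

def doubled(values):
    """One round trip across the boundary: translate, call, translate back."""
    c_array = to_c_array(values)
    double_in_place(c_array)
    return from_c_array(c_array)

print(doubled([1, 2, 3]))
```

Each call to doubled copies the whole list twice, so the overhead grows with both the size of the data and the number of crossings, which is the runtime cost described above.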

Garbage collection makes this quite hard, since interfacing from a language with a managed memory model to one without requires extreme caution, and interfacing between languages with different memory models is usually so inefficient that it becomes impractical.

As a tangible example of these complexities, I offer the OCaml user manual section on interfacing C with OCaml.

Really, the way out of this mess is to use a common runtime of some sort, and thus is born the war of the virtual machines. Ideally, one's favorite languages all compile to bytecode on the same VM and can at least interoperate with civility. Two obvious examples have already arisen: Microsoft's .NET framework with its constellation of front-end languages (C#, F#, VB.NET, etc), and Sun's JVM. The JVM is obviously primarily for Java, but it is accompanied by a miscellany of other language implementations, cf. Jython, JRuby, and more academically Scala.
