Changes to this document since its original publication are marked with changebars on the left.
In October 2004, various members of the Mono group from Novell and Mainsoft, as well as a couple of Mono contributors, met for a week in the Novell offices in Cambridge to discuss Mono.
In some of the meetings we took notes, and this document is a compilation of some of the topics discussed in those meetings. It is not a comprehensive record of the discussions we had.
Many things are likely missing, and although we have tried to give some cohesion to the minutes, they should be taken with a grain of salt. We will be glad to update this document if there are questions, using change-bars to document the changes; send your feedback to firstname.lastname@example.org.
This document is in no way authoritative and might contain errors and mistakes.
The face-to-face meeting this week was fantastic: it allowed members of the team to learn about the issues being faced and the solutions we are looking for.
It was also an experiment in maxing out an American Express expense card.
Mainsoft and Novell share a large amount of code from the Mono class libraries. Novell consumes and ships those libraries as part of the Mono project, an ECMA CLI implementation. Mainsoft consumes them and ships them as Visual MainWin for J2EE.
Mainsoft has contributed in the past to the System.Data, System.Web and Basic assemblies in the Mono CVS repository, but as the ship dates for each product approached, the code bases slightly diverged.
During the meeting we discussed an ongoing strategy for merging changes into the Mono repository that would help Mainsoft keep track of our code base: merging the Java-specific bits into our tree using conditional compilation, and integrating their test suites into Mono to run continuously, allowing them to integrate the class libraries without risking a loss of functionality.
During our discussions we learned a lot about Mainsoft's current work on the scalability of the Mono class libraries, and we are going to set up a few test labs to reproduce their setup for testing and QA.
In general, it is easier for Mainsoft to consume pure managed code than code that uses P/Invoke, so we will try whenever possible to reduce our dependencies on external libraries.
The Mainsoft developers sat down with us to discuss bugs in our code base and to discuss performance problems, pinpointing them for us. We want to thank them for coming to Boston.
Chris Toshok and Martin Baulig worked tirelessly this week to stabilize the debugger. Chris described this week as his `most intense hacking week since college'.
Tasks accomplished this week:
Attaching to running processes:
Inlining: Currently this optimization is turned off, and it operates within a single basic block. There are no known bugs in it, but we should start using it on a daily basis to tune the inlining threshold and fix any bugs it might expose.
If the threshold is too high we generate a lot of code and bloat the program at runtime; currently the condition is set to inline only methods of 20 bytes of IL code or less. Luckily most accessors fit nicely under this limit, so we get most of them.
It is important to start profiling this data to find the sweet spot.
For inlining to be fully effective we need copy propagation and dead code elimination. Today we only do local copy propagation by default; global copy propagation is only enabled when the SSA optimizations are turned on. Massimiliano will implement a version for the non-SSA case.
Small benchmarks with inlining and copy propagation show that copy propagation can help by as much as 2x.
Inlining should be disabled in a few cases (anything that uses Declarative Security for instance).
Once dead code elimination is present we can make this the default. Massimiliano will be responsible for completing this.
Ahead of Time Compilation: AOT is a key feature in Mono: it compiles managed code into native code ahead of time, which reduces JIT time and allows us to run all of the expensive optimizations.
We would like to distribute packages with AOT turned on, but today there are a couple of reasons why we do not do this:
We are addressing these issues as follows:
The code will use an ELF-like setup for accessing global variables and invoking methods (a Global Offset Table for variables and a Procedure Linkage Table for invoking methods).
Zoltan has expressed his interest in implementing this and has already started reworking the file format (changes are on CVS).
New Trampolines: Today we use two kinds of trampolines at runtime: one for calls and vtable slots, and one for delegate loading (ldftn). For vtable slots and calls we patch the call site with the newly JITed address; for delegates we have to jump into some fairly slow code.
Paolo has developed a new implementation that is almost ready to be checked into CVS. The new code uses a single trampoline which is always patched.
The new code reduces the amount of per-architecture code that is needed and also allows for dynamic recompilation of "hot" methods.
This change will benefit IKVM and IronPython performance the most.
Synchronized methods will have the synchronization primitives injected directly into the JITed code, instead of using a wrapper, to improve performance.
AppDomain: We want to move to a setup where mscorlib is always compiled in "shared" mode. Since this is currently not the case, when we pass strings from one AppDomain to another we must marshal the strings (and any other objects) and re-create them on the receiving end.
Compiling mscorlib in shared mode would get rid of this problem. The only issue is that it has an impact on performance (anywhere from 10% to 20%). The solution in this case is to strengthen our optimizations and make AOT compilation the default.
We need a portable way to access Thread Local Storage (TLS). Windows and MacOS have mechanisms for doing this; we need the io-layer to expose this on all platforms.
AppDomain unloading: We need hooks in the GC to walk the stack and identify any live objects in order to safely shut down an application domain. This problem appears to have been fixed by Zoltan on Monday the 25th.
AppDomain Thread Object: A major problem today is that the Thread object is shared across multiple application domains, and we produce references to objects in a different application domain when certain fields on the Thread object are accessed (most likely through the Thread.CurrentThread property). This can lead to crashes, and is visible when people run more than one ASP.NET application at once.
Possible options: a) We could create different Thread objects for each AppDomain that point to the same thread; b) Morph the thread object when switching AppDomain.
Zoltan expressed interest in fixing this issue, but it requires research and understanding of the semantics of the CurrentThread object.
Precise and Generational GC for Mono: This would be a Mono 2.0 feature. It is a long-term project; in the short term we will continue to use Boehm.
Eyal pointed out that a new GC would be nice and would avoid heap fragmentation, a problem that plenty of non-Java server applications face today, even though Boehm has worked fine for most.
There are two separate issues: the GC algorithm and the runtime support for it. Runtime developers should start using the precise-handle API and be aware of it for any dynamically JITed code.
We still hope to reuse the algorithm from an existing third party implementation.
Linear IR representation: Today Mono goes from the CIL stream into a tree representation where we run instruction selection with pattern matching and produce a linear list of instructions. This was done because the x86 instruction set benefits a lot from good pattern matching into its instruction set, but it poses several problems.
Today most optimizations are done on this tree of instructions and then the native code is produced from this.
The most important problem is that by building a tree and producing the IR too early, too many optimizations cannot be done effectively, and most RISC architectures benefit little or nothing from this transformation. In addition, things like constant folding become more complex in this representation.
After Mono 1.2 we will move to a different setup: the CIL instruction stream will be translated directly into an instruction list, so we can perform some of the optimizations that today are done too late, in synchrony with the updated register allocation (see mono/docs for details on the new register allocator). At that point we will apply instruction selection using a modified version of monoburg for x86 and x86-64; the other platforms have no need for this.
Code Access Security (CAS): Mono's CAS implementation is moving along. For our next release of Mono the VM support for it will be ready, but it will not be activated by default, as a security audit will be required to add all of the necessary demands.
Link demand is mostly done except for some corner cases, and we need to be able to do stack walks.
Sebastien will design the format for the security information on the stack.
We are going to make the verifiable code generator throw an exception instead of aborting (some work is ongoing in this area), as well as remove asserts from the metadata engine so that it copes with invalid images.
Exceptions and Signal Handlers: The discussion centered on the handling of null reference faults in unmanaged code: should we turn these into an exception, or should we abort (as this likely indicates a big error in the libraries or in the programmer's code)?
The problem with aborting is that there are cases where we use helper methods that happen to be unmanaged (like memcpy or memset to implement cpblk/initblk), and we need to be able to distinguish an error from a valid exception being thrown.
We argued that initblk/cpblk should only be given valid references to start with, and that any misuse of them means the programmer has made a fatal mistake.
Another point discussed was how to handle StackOverflowException: today we can cope with it if the user builds his own Mono, but we cannot package the software with this enabled by default on all platforms; it is not working.
SSAPRE: SSAPRE will be done soon. Massimiliano was able to JIT his first SSAPRE-optimized method without any problems on Friday, and is now cleaning up the code and fixing bugs so that Mono can run with this optimization turned on.
The next step is critical edge removal (to reduce code size and prevent incorrect code from being generated).
Massi also wants to incorporate the benefits of GVNPRE into his SSAPRE implementation and has a plan for getting these features into our JIT engine.
Thread Setup: We will be removing the current extra thread that is required to run Mono applications. This means that OSX developers will get their main application executed on the main thread, and that the embedding API will be simplified by not having to provide a callback mechanism for starting up your application.
RWLock: Mainsoft contributed a new reader/writer lock that we will be integrating into Mono soon, as the current one is a source of scalability problems today.
Other optimizations: After looking at inlining and SSAPRE, Massi will look at loop unrolling and induction variable elimination.
This is one of the most important assemblies for people deploying applications on servers. Today most of the providers were contributed by developers in the community, and they are maintained as community members update them.
We need to pick a couple of providers to fully support and maintain. From an open source perspective the Postgres and Firebird providers are the best maintained, and we would likely choose a couple of proprietary drivers to be fully maintained by Novell, leaving the rest to the community.
Mainsoft offered to develop and maintain the disconnected mode of operation for System.Data as this is code that will be shared, but for the connected mode they will be using the underlying Java libraries so we would not be sharing this code. The rest of the Mono team should focus its efforts on the connected operation mode.
Mainsoft suggested that we move to a complete reference/base-class provider implementation and assist the providers to reuse as much code as possible from this reference/base-class implementation to reduce the amount of replicated code.
As for the open source providers, the only reason for our choice of Postgres is that its provider is actively maintained by Francisco Figueiredo and the code resides in CVS. Carlos is interested in maintaining the code for the Firebird provider in CVS as well.
The Firebird provider has an extra plus: Firebird can be used as an embedded database, allowing server-less deployments.
We need to rename the ByteFX.Data provider to something else, as Reggie Burnett has left that provider (the rename was agreed with Reggie).
There is good news: the upcoming version of .NET has shrunk the number of features being added to the XML stack, which reduces the amount of work that must be done to release the .NET 2.0 stack.
We had been working on an XQuery implementation for our stack, and with this change we have moved our implementation into a separate assembly (Mono.Xml.Ext) so people interested in XQuery and XPath2 can still use it (on both Mono and .NET).
Mainsoft is looking into building an XPath/XSLT compiler that will improve the performance of XPath and XSLT transformations. This could be worked in parallel and deployed when it becomes ready.
Lluis Sánchez demonstrated a new code generation API that simplifies the generation of dynamic code. It is a fairly high-level system: a higher-level representation than IL and Reflection.Emit, and a lower-level one than CodeDOM.
The benefit of this new API is that the code generation phase is easy to read and maintain in the host source code; it takes care of all of the IL generation details for you and completely wraps Reflection.Emit. It is a much simpler solution, and simpler to debug, than raw Reflection.Emit for those interested in generating code dynamically for performance reasons.
Lluis prototyped this idea to avoid using the C# compiler when generating XML serializers on the fly, but it can be reused in many other contexts.
There are still no plans on where this code will be hosted in the repository.
Lluis has posted a description of it, as well as the source code, in a blog entry.
We have chosen to migrate the Mono CVS repository to Subversion. Todd Berman has been kind enough to do a couple of dry runs and we expect to announce the exact plans of migration in the next couple of days.
Everyone with a CVS account already has a Subversion account on the Mono repository, and we will follow up with instructions and dates for the migration.
After we migrate to Subversion we will move the `mcs' directory inside the `mono' directory and unify the build system so a single `make' command will build both the runtime and the class libraries.
This change will let us distribute a single source code package with a minimal bootstrap system. The intention is to allow third-party distributions to apply patches and fixes to the C# source code easily; currently this is hard for a third-party distribution to do.
The minimal bootstrap system is a set of CLI binaries for the C# compiler, mscorlib, System and System.Xml.
The security requirements for the 1.1 profile require us to do a multi-stage build of corlib and related libraries. Today we already do two passes for System and System.Xml, as they reference each other.
We are considering using a Mono wrapper program that will invoke a binary tuned for a particular architecture: if i686/TLS features are present we would use one binary tuned for that platform; if those features are not available we would use a different binary.
This would have to be done at the packaging stage; it is not something we can, or want to, do in the default tarball.
Having a wrapper also means that we can safely set the program name to the name of the target executable, which is convenient for those who want to kill just the right process (developers who use `killall mono' today bring down everything they are running).
We had a long discussion on packaging. Today Mono is distributed in RPMs that are too granular and have lots of cross dependencies. The quick fix is either to use Red Carpet or (for the more desperate) to install every package.
The current split is too fine-grained, so we are going to repackage things so that Mono comes out in twelve packages as opposed to thirty-something.
There will be a `mono-core' package that contains only the minimal system required to run Mono which was a common request.
The daemon takes care of three things: process management, file share recording and shared memory contention.
The IO layer manages two kinds of resources: resources local to the process, and resources that must be shared across various Mono processes.
There are a couple of issues today: the FileShareMode hashtable is a shared resource and can significantly slow down applications that do a lot of opens.
Windows processes do not have parent and child relationships; they have siblings. With a process id you can get a handle to a process and obtain its exit status, and a process that spawns a new process automatically gets the new process's handle.
If you have two terminals and you run an application, the daemon is started; if you then start a new process in another terminal, the user limits applied to the new application will not be applied to the daemon.
On Windows the parent can start a new process and then exit.
pthreads has support for process-shared mutexes, but no support for naming them; we could use named semaphores instead.
File share modes can be stored in the shared segment. We know how many files can be opened, so we can create an array (though this can change dynamically, so we might have to resize).
We can define a signal, and when a Mono application receives that signal it can alter the shared memory; we should choose a signal that is ignored by default.
We should use a point-to-point system instead of having a single process do the updates.
Each process can allocate 20 slots with some metadata (process, timestamp, ...) and will reference only its own slots; each slot will have a type, and the finalizer thread will scan and remove slots when a process is killed.
The key idea is that instead of having a separate process that manages everything, one process can decide to take on the role of the daemon and manage the shared handles.
We can delay the creation of shared handles: if there are no other processes, we do not need them.
Dick will update the io-layer coding conventions to match the rest of the Mono runtime.
The Gtk# TreeView/ListView is a capable widget, but the API is fairly complex for developers who want to get something on the screen quickly.
We looked at various options:
At the time of this writing Duncan has shown that a short path to simplifying the API is to use NodeStore and NodeView; you can see the details and a screenshot here.
Implementing a new treeview/listview is still an option, but it is not a priority at this point.
We also learned that we can do a lot of the databinding work in Gtk# in general, so we will likely add support for System.ComponentModel to the standard Gtk# widgets to allow for databinding.
We had a long discussion on databinding widgetry. Today the System.Web namespace implements databinding functionality, but this is not generally available to Gtk#, and we have to write new support for it in Windows.Forms.
We considered building it for Windows.Forms and then translating the expertise to Gtk# later. Mike felt that little of the code could be reused, so we have decided for now that Mike will keep his focus on improving Gtk#, and the Windows.Forms team will implement the databinding widgets on their own.
For the past few months, since the 1.0 release, Mike has been updating Gtk# as part of a full API audit to ensure that the code produced by our binding generator is correct, fixing problems as we go. So far the API audit has covered glib and gdk, and gtk is mostly complete.
The current priorities are:
After this we will branch Gtk# for the new release, which moves the binding from wrapping the Gnome 2.2 APIs to wrapping the Gnome 2.6 APIs. This work has already been done by some folks in the community, and most of the remaining work is merging the gtk-sharp-2-4 branch into HEAD.
In addition to these, we will be looking at adopting the System.ComponentModel for the core widgets (to support databinding) and to add support to this to our GtkTreeModel/TreeList (see separate section on this for details).
As for Pango, we believe we will need to roll a hand-made binding for all of the components that the generator cannot handle today.
Another post-1.0 feature is to look into CLS compliance for Gtk#, as today Gtk# is not a CLS-compliant class library.
Stetic is the new Gtk# GUI designer, written from scratch. It is being developed by Dan Winship, who demoed its current state. We will report more as progress is made on Stetic.
The goals of Stetic are to produce a modern GUI designer, learning from modern GUI designers and to provide good integration hooks with third-party IDEs (like MonoDevelop or Eclipse).
Work continues in this area. Martin got the C5 library to build and execute with Mono (the code has not been checked in yet; he called from his train in the middle of Colorado).
We have made a mental note to travel only by train from now on, as productivity of everyone who traveled by train tripled.
For now the focus will remain on feature completeness; only afterwards will we look into performance tuning and the memory footprint of generic classes.
Priorities: the team at Novell will work towards completion of the Windows.Forms API for the 1.0 profile wherever possible (things like ActiveX won't be supported). The default rendering engine for Windows.Forms will be the built-in theme engine.
The team will not focus on third party theme integration (Gtk, WinXP UXTheme API, etc) nor will they develop a CoreGraphics/Cocoa driver for the first release of Windows.Forms. If third party developers implement drivers for these things we can ship them, but we will not be doing them ourselves.
The Windows.Forms implementation has exposed various bugs and performance bottlenecks in System.Drawing; these are being fixed.
Non-anti-aliased straight line drawing in Cairo remains a problem. Various angles have been attempted, but nothing really solid has come of them.
Progress is rapid and some external contributors are starting to pick up and assist us with the bug count.
System.Drawing in Mono is implemented by P/Invoking into our implementation of GDI+, which in turn is implemented in terms of Cairo for the actual rendering. This means that we will be able to benefit from any hardware optimizations that are added to Cairo. Our Windows.Forms is implemented on top of System.Drawing.
We fixed a few issues with locks in System.Drawing this week: the locks were misplaced, and we have managed to remove most of them.
Neither Mainsoft nor Novell is ready to start work on the .NET 2.0 profile; both are still completing pieces of 1.x and dealing with scalability and performance issues. We agreed that it was a good use of our time to first fix the issues in the 1.x profile, and only then move forward.
Nonetheless we had a discussion on a path to implement the 2.0 features.
Some assemblies did not change a lot, or work on them had already started and is pretty much complete. The System.Web.Services and System.Xml libraries are fairly complete for 2.0 (modulo the ongoing changes that will likely appear leading up to the official release).
In the case of System.Web, which is a huge piece, we have decided that we will not necessarily implement the whole stack, but identify discrete pieces that can be implemented and fully supported, and provide a roadmap for development as well as a document describing what is and what is not supported. The intention is to be able to ship the main features before we are able to complete all of the work required.
In the .NET world EnterpriseServices and System.Messaging implement a queue system and a transaction system, both missing from the Mono class libraries.
It would be possible to implement those APIs and rely on an external Java application behind them (plenty of open source software is available in this area).
We discussed the possibility of, instead of going down that path, looking at Indigo as a substrate for implementing those two APIs.
Nobody is currently looking at this.
Mainsoft described their setup for performance tuning and regression testing; we will replicate the same hardware setup in the Cambridge offices.
The goal of the shared meetings with Mainsoft was to improve the performance and reliability of the code that both of our projects share.
Mainsoft discussed a few problems today in Mono's libraries:
A general problem here is that too much copying is happening on each request.
Neale Ferguson from SoftwareAG came to discuss his S390 port. He is backporting the S390 port to the Mono 1.0 branch and is also approaching completion of the 64-bit port of Mono to s390x.
The S390 port will work on 64-bit systems, but won't be able to take advantage of the extra address space.
Massimiliano has started a JIT port to the ARM CPUs in his copious spare time. The ARM machine we got for him is smaller than a box of cigarettes, runs Linux and has a wireless interface. The AC power adaptor is larger than the machine itself.
Zoltan's AMD64 port contains SSE optimizations; once complete, they will be backported to the x86 port.