CMlyst 0.3.0 released

CMlyst is a Web Content Management System built using Cutelyst. It was initially inspired by WordPress and then by Ghost, so it's a mixture of both.

Two years ago I did its first release, and since then I've been slowly improving it. It's been in production for that long, serving the www.cutelyst.org web site/blog. The 0.2.0 release was a silent one that marked the transition from QSettings storage to SQLite.

Storing content in QSettings was quite interesting at first since it's easy to use, but it proved unsuitable very quickly. First, it kept leaving .lock files around. Then, since it's not very fast to access, I had to use a cache with all the data, plus a notifier that updated the cache when something changed in the directory. But the changes also didn't properly trigger QFileSystemWatcher, so once a new page was out the cache wasn't properly updated.

Once it was ported to SQLite, I decided to study how Ghost worked, mainly because many Qt/KDE developers were switching to it. Ghost is quite simple, so it was very easy to provide something quite compatible with it. Porting a Ghost theme to CMlyst requires very few changes because its syntax is close to Grantlee/Django.

The port to SQLite also made it clear that an export/import tool was needed, so you can now import/export the content in a JSON format that is pretty close to Ghost's. You can actually even import all your Ghost pages with it, but the opposite won't work, because we store pages as HTML, not Markdown. My feeling about Markdown is that it is simple to use and convenient for geeks, but it's yet another thing to teach users, who can simply use a WYSIWYG editor.

Security-wise you need to be sure that both Markdown and HTML are safe, and CMlyst doesn't do this, so if you put it in production be sure that only users who know what they are doing use it. You can even break the layout with an unclosed tag.

But don't worry, I'm working on a fix for this: html-qt is a mostly complete parser for the WHATWG HTML5 specification, but the part that builds a DOM is not done yet. With it I can make sure the HTML won't break the layout, and remove unsafe tags.

Feature-wise, CMlyst has about 80% of Ghost's features. If you like it, please help add the missing features to the Admin page.

Some cool numbers

Comparing CMlyst to Ghost can be tricky, but it's interesting to see the numbers.

Memory usage:

  • CMlyst uses ~5MB
  • Ghost uses ~120MB

Requests per second (using the same page content):

  • CMlyst 3500 rps (production mode), 1108 rps (developer mode)
  • Ghost 100 rps (production mode)

While the RPS numbers are very different, in production you can use an NGINX cache, which would make Ghost's low RPS not a problem, but that comes at the price of more storage and RAM usage. If you run on an AWS micro instance with 1GB of RAM, this means you can have far fewer instances running at the same time: some simple math shows you could have 200 CMlyst instances vs 8 of Ghost.
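The simple math can be written down as a tiny sketch. It assumes the 1GB instance is counted as 1000MB and that the whole of it is available to the applications (no OS overhead), which is an idealization. The function name is made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: how many instances of a given size fit in RAM.
// Integer division rounds down, since a partial instance cannot run.
int instancesThatFit(int totalRamMb, int perInstanceMb)
{
    assert(perInstanceMb > 0);
    return totalRamMb / perInstanceMb;
}
```

With the memory numbers above, `instancesThatFit(1000, 5)` gives 200 for CMlyst and `instancesThatFit(1000, 120)` gives 8 for Ghost.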

Try it!

https://github.com/cutelyst/CMlyst/archive/v0.3.0.tar.gz

Sadly it's also likely that I'll soon be forking Grantlee. The lack of maintenance just hit me yesterday (when I was going to release this): Qt 5.7+ changed QDateTime::toString() to include TZ data, which broke the Grantlee date filter, which isn't expecting that, so I had to do a weird workaround marking the date as local to avoid the extra information.

Cutelyst 1.7.0 released! WebSocket support added.

Cutelyst, the Qt / C++11 web framework, just got another important release.

WebSocket support is probably a key feature to have in a modern web framework. Perl Catalyst doesn't look like it was designed with it in mind, and the way I found to do WS there wasn't intuitive.

Qt has WebSocket support via the QWebSockets module, which includes client and server implementations, and I have used it for a few jobs due to the lack of support inside Cutelyst. While looking at its implementation I realized that it wouldn't fit into Cutelyst, and it even had a TODO about a blocking call, so no go.

I then looked at the uWSGI implementation, and while it is far from being RFC compliant or even usable (IMO), its parser was much simpler, but it required that all the data be available to parse. Messages can be split into frames (really, 63 bits to address the payload was probably not enough), and each frame has a bit that tells whether the message is over. uWSGI simply discards that bit, so a fragmented message can't be merged back.
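To make the framing concrete, here is a minimal sketch of a frame header parser following the RFC 6455 layout. It is not Cutelyst's or uWSGI's code. The names `FrameHeader`, `parseFrameHeader` and `MessageAssembler` are invented for illustration, and unmasking of client payloads and control frames are left out. It returns false when more bytes are needed instead of requiring all the data up front, and it keeps the FIN bit so fragments can be merged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative types, not part of any real library.
struct FrameHeader {
    bool fin = false;                 // true on the last frame of a message
    std::uint8_t opcode = 0;          // 0x0 continuation, 0x1 text, 0x2 binary...
    bool masked = false;              // client-to-server frames are masked
    std::uint64_t payloadLength = 0;
    std::size_t headerSize = 0;       // bytes used by the header itself
};

// Returns false if `size` bytes are not yet enough to hold the whole header.
bool parseFrameHeader(const std::uint8_t *data, std::size_t size, FrameHeader &out)
{
    if (size < 2)
        return false;
    FrameHeader h;
    h.fin = (data[0] & 0x80) != 0;
    h.opcode = data[0] & 0x0F;
    h.masked = (data[1] & 0x80) != 0;
    std::uint64_t length = data[1] & 0x7F;
    std::size_t offset = 2;
    // 126 means a 16-bit length follows, 127 a 64-bit one (top bit must be 0).
    const std::size_t extra = length == 126 ? 2 : (length == 127 ? 8 : 0);
    if (size < offset + extra)
        return false;
    if (extra != 0) {
        length = 0;
        for (std::size_t i = 0; i < extra; ++i)
            length = (length << 8) | data[offset + i]; // big-endian
        offset += extra;
    }
    if (h.masked)
        offset += 4; // masking key
    if (size < offset)
        return false;
    h.payloadLength = length;
    h.headerSize = offset;
    out = h;
    return true;
}

// Merges fragments back into one message by honoring the FIN bit.
struct MessageAssembler {
    std::string message;
    // Appends one (already unmasked) payload; returns true when complete.
    bool addFrame(const FrameHeader &h, const std::string &payload)
    {
        message += payload;
        return h.fin;
    }
};
```

Feeding it a text frame with FIN unset followed by a continuation frame with FIN set yields one merged message, which is exactly the case that gets lost when the FIN bit is discarded.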

WebSockets in Cutelyst have an API close to what QWebSocket has, but since we are a web framework we have more control over things. First, in your WebSocket endpoint you must call:

c->response()->webSocketHandshake()

It returns false if the client didn't send the proper headers or if the Engine backend/protocol doesn't support WebSockets. In that case just return from the method to ignore the request, or, if you want the same path to also show a chat HTML page, serve that page instead. If it returns true you can connect to the Request object and get notified about pong, close, binary and text (UTF-8) messages and frames. A simple usage looks like:

connect(req, &Request::webSocketTextMessage, [=] (const QString &msg) {
  qDebug() << "Got text msg" << msg;
  response->webSocketTextMessage(msg);
});

This will work as an echo server. The signals also include the Cutelyst::Context, in case you are connecting to a slot.

The Cutelyst implementation is non-blocking, a little faster than QWebSockets, uses less memory and passes all the http://autobahn.ws/testsuite tests. It doesn't support compression yet.

systemd socket activation support was added. This is another very cool feature, although it's still missing a way to make the application exit when idle. In my tests with systemd socket activation, Cutelyst started so fast that the first request, the one that starts it, takes only twice the time it would take if Cutelyst were already running.
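For readers unfamiliar with socket activation: systemd opens the listening sockets itself and passes them to the service as inherited file descriptors, announced through the LISTEN_PID and LISTEN_FDS environment variables, with the first descriptor being number 3. The sketch below mirrors the basic checks that sd_listen_fds() performs. It is not the code Cutelyst uses, and the function name is made up.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdlib.h>   // setenv/unsetenv, used when trying the function out
#include <string>
#include <unistd.h>   // getpid

// First inherited descriptor, same value as SD_LISTEN_FDS_START.
const int kListenFdsStart = 3;

// Hypothetical helper: number of sockets passed in by systemd, or 0 when the
// process was not socket-activated. The descriptors are then
// kListenFdsStart .. kListenFdsStart + n - 1.
int inheritedListenFds()
{
    const char *pidEnv = std::getenv("LISTEN_PID");
    const char *fdsEnv = std::getenv("LISTEN_FDS");
    if (!pidEnv || !fdsEnv)
        return 0;

    char *end = nullptr;
    const long pid = std::strtol(pidEnv, &end, 10);
    // The variables may be inherited from a parent; only trust our own pid.
    if (*end != '\0' || pid != static_cast<long>(getpid()))
        return 0;

    const long count = std::strtol(fdsEnv, &end, 10);
    if (*end != '\0' || count < 0)
        return 0;
    return static_cast<int>(count);
}
```

A server would call this once at startup and, when the result is non-zero, listen on the inherited descriptors instead of creating and binding its own sockets.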

This release also includes:

  • Fixes for using cutelyst-wsgi talking FastCGI to Apache
  • Workaround for QLocalServer closing down when an accept fails (very common when used on prefork/threaded servers)
  • Fix for cutelyst-wsgi in threaded mode, where POST requests would share the same buffer, corrupting data or crashing
  • A few other fixes

Due to WebSockets, plans for Cutelyst 2 are taking shape, but I'll probably try to add HTTP/2 support first, since the multiplexing part might make the Cutelyst Engine quite complex.

Have fun https://github.com/cutelyst/cutelyst/archive/v1.7.0.tar.gz

Cutelyst benchmarks on TechEmpower round 14

The new TechEmpower benchmark results are finally out. Round 14 took almost 6 months to be ready, and while more frequent testing is desirable, this gave us plenty of time to improve things.

Last round a bug was crashing our threaded benchmarks (which wasted time restarting). Besides having that bug fixed, the most notable changes that improved our performance were:

  • Use of a custom epoll event dispatcher instead of the default glib one (the tests without epoll in their name are using the default glib one)
  • jemalloc, which brought the coolest boost: you just need to link to it (or use LD_PRELOAD) and memory allocation magically gets faster (around 20% for the tests)
  • CPU affinity, which helps the scheduler keep each thread/process pinned to a CPU core
  • SO_REUSEPORT, which on Linux helps to get connections evenly distributed among the threads/processes
  • Several code optimizations thanks to perf

I'm very glad with our results. Cutelyst managed to stay with the top performers, which really pays off all the hard work. It also provides us with good numbers to show to people who are interested.

Check it out https://www.techempower.com/benchmarks/

Cutelyst 1.6.0 released, to infinity and beyond!

Once 1.5.0 was released I thought the next release would be a small one. It started with a bunch of bug fixes, and Simon Wilper made a contribution to Utils::Sql. Basically, when things get out to production you find bugs, so there were tons of fixes to the WSGI module.

Then the first TechEmpower benchmarks preview for round 14 came out and Cutelyst's performance was great, so I was planning to release 1.6.0 as it was. But the second preview fixed a bug that had scaled Cutelyst's results up, so our performance was actually worse than in round 13, and that didn't make sense since it now had jemalloc and a few other improvements.

Actually the results on the 40+HT core server were close to the ones I got locally with a single thread.

Looking at the machine state it was clear that only a few (9) workers were running at the same time, so I decided to create an experimental connection balancer for threads. Basically the main thread accepts incoming connections and evenly passes them to each thread, which of course puts a new bottleneck on the main thread. Once the code was ready, which ended up improving other parts of WSGI, I became aware of SO_REUSEPORT.
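The idea of passing connections evenly can be sketched as plain round-robin. This is only an illustration of the technique, not the balancer that went into the WSGI module. The class name is invented, and the hand-off to each thread's event loop (with the synchronization that needs) is replaced by simple per-worker queues.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative round-robin balancer: the accepting thread calls dispatch()
// for every new connection and workers are picked in turn.
class RoundRobinBalancer
{
public:
    explicit RoundRobinBalancer(std::size_t workers) : m_queues(workers)
    {
        assert(workers > 0);
    }

    // Queues the accepted descriptor and returns the chosen worker index.
    std::size_t dispatch(int connectionFd)
    {
        const std::size_t worker = m_next;
        m_queues[worker].push_back(connectionFd);
        m_next = (m_next + 1) % m_queues.size();
        return worker;
    }

    // Descriptors handed to one worker so far.
    const std::vector<int> &queue(std::size_t worker) const
    {
        return m_queues[worker];
    }

private:
    std::vector<std::vector<int>> m_queues;
    std::size_t m_next = 0;
};
```

Every connection still goes through the single thread calling dispatch(), which is the bottleneck mentioned above.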

The SO_REUSEPORT socket option is available on Linux >= 3.9, and unlike on BSD it implements a simple load balancer. This made my thread balancer obsolete, though it's still useful on non-Linux systems. This option is also nicer since it works for processes as well.
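As a rough sketch of how the option is used (plain POSIX sockets on Linux, not the Cutelyst WSGI code, with helper names made up for illustration): every thread or process creates its own listening socket, sets SO_REUSEPORT before bind(), and binds to the same port. The kernel then spreads incoming connections among those sockets.

```cpp
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a TCP listening socket with SO_REUSEPORT set.
// `port` is in host byte order; 0 lets the kernel pick a free port.
// Returns the descriptor, or -1 on failure.
int listenWithReusePort(std::uint16_t port)
{
    const int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    // The option must be set before bind() on every socket sharing the port.
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) != 0
        || bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0
        || listen(fd, SOMAXCONN) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: port a socket is bound to, or 0 on error.
std::uint16_t boundPort(int fd)
{
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<sockaddr *>(&addr), &len) != 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

Calling `listenWithReusePort()` twice with the same port from the same user should give two independent listening sockets, one per worker, where without the option the second bind() would fail with "address already in use".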

With 80 cores there's still the chance that the OS scheduler puts most of your threads on the same cores, and maybe even moves them when under load. So an option for setting CPU affinity was also added, which allows each worker to be pinned to one or more cores evenly. It uses the same logic as uwsgi.
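A sketch of what pinning workers evenly can look like on Linux follows. The distribution scheme here (consecutive cores per worker, wrapping around) is an assumption made for illustration and is not guaranteed to match the uwsgi or Cutelyst logic exactly. The function names are made up.

```cpp
#include <cassert>
#include <sched.h>   // cpu_set_t, CPU_SET, sched_setaffinity (Linux/glibc)
#include <vector>

// Hypothetical scheme: worker N gets `coresPerWorker` consecutive cores,
// wrapping around when there are more workers than cores.
std::vector<int> coresForWorker(int workerIndex, int coresPerWorker, int totalCores)
{
    assert(coresPerWorker > 0 && totalCores > 0);
    std::vector<int> cores;
    for (int i = 0; i < coresPerWorker; ++i)
        cores.push_back((workerIndex * coresPerWorker + i) % totalCores);
    return cores;
}

// Pins the calling thread to the given cores; returns false on failure.
bool pinCurrentThread(const std::vector<int> &cores)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int core : cores)
        CPU_SET(core, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Each worker would call `pinCurrentThread(coresForWorker(index, 1, 80))` once at startup, so that 80 workers end up on 80 different cores instead of wherever the scheduler happens to place them.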

Now that the WSGI module supported all these features, preview 3 of the benchmarks came out and the results were still terrible... Further investigation revealed that a variable supposed to be set to the CPU core count was set to 8 instead of 80. I'm sure all this work did improve performance for servers with lots of cores, so in the end the wrong interpretation was good after all :)

Preview 4 came out and we are back at the top. I'll do another post once it's final.

The code name "to infinity and beyond" came to mind due to the scalability options it got :D

Last but not least I did my best to get rid of doxygen missing documentation warnings.

Have fun https://github.com/cutelyst/cutelyst/archive/v1.6.0.tar.gz

Cutelyst 1.5.0 released, I18N and HTTPS built-in

Cutelyst, the C++/Qt web framework, just got a new stable release.

Right after the last release Matthias Fehring made another important contribution, adding support for internationalization in Cutelyst. You can now have your Grantlee templates properly translated depending on the user's settings.

Then on IRC a user asked if Cutelyst-WSGI had HTTPS support, which it didn't. You could enable HTTPS (as I do) using NGINX in front of your application or using uwsgi, but of course having that built into Cutelyst-WSGI is a lot more convenient, especially since his use would be for embedded devices.

Cutelyst-WSGI also got support for --umask, --pidfile, --pidfile2 and --stop (which sends a signal to stop the instance based on the pidfile provided), plus better documentation. There are fixes for respawning and cheaping workers, and since I now have it in production for all my web applications, FastCGI got some critical fixes.

The cutelyst command was updated to use the WSGI library, so it works on Windows and OSX without requiring uwsgi, making development easier.

www.cutelyst.org got the Cutelyst logo and an updated CMlyst version, though the site still looks ugly...

Download here: https://github.com/cutelyst/cutelyst/archive/v1.5.0.tar.gz

Have fun!