Daniel Nicoletti, São Paulo, Brazil, http://dantti.wordpress.com
Right after my first blog post about Cutelyst, users asked me about more "realistic" benchmarks and mentioned the TechEmpower benchmarks. Although building the tests would have been fairly easy at that time, it didn't seem worth the time while Cutelyst was a moving target.
Around 0.12 it became clear that the API was nearly frozen, and since everyone loves benchmarks they would also serve as a "marketing" thing. Since the beginning of the project I've learned lots of optimizations which made the code get faster and faster. There is still room for improvement, but changes now need to be carefully measured.
Cutelyst is the second web framework on the TechEmpower benchmarks that uses Qt, the other one being TreeFrog, but Qt's appearance was noticed on their blog:
The cutelyst-thread tests suffered from a segfault (the crashed worker was restarted by the master process). The fix is in the 1.0.0 release but it was too late for round 13, so in round 14 it should get even better results.
If you want to help, the MongoDB/Grantlee/Clearsilver tests are still missing :D
Cutelyst, the Qt web framework, just reached its first stable release. It's been 3 years since the first commit and I can say it finally got to a shape where I think I'm able to keep its API/ABI stable. The idea is to leave any breaks for a 2.0 release by the end of next year, although I don't expect many changes as I'm quite happy with its current state.
The biggest change from the last release is the SessionStoreFile class. It initially used QSettings for its simplicity, but it was obvious that its performance would be far from ideal, so I replaced QSettings with a plain QFile and QDataStream. This produced smaller session files and made the code twice as fast, but profiling showed it was still slow because it was writing to disk multiple times during the same request. So the code was changed to merge the changes and only save to disk when the Context gets destroyed. On my machine the performance went from 3.5k to 8.5k on reads and writes and from 4k to 12k on reads; this is with the same session id, so it should probably be faster with different ids. It is also very limited by disk IO as we use a QLockFile to avoid concurrent access, so if you need something faster you can subclass SessionStore and use Sql or Redis...
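The "merge the changes and save once" idea can be sketched without Qt. This is only an illustration of the pattern, not the SessionStoreFile code: `FakeDisk` and `DeferredSession` are made-up names, the "disk" is an in-memory map with a write counter, and the destructor plays the role of the Context being destroyed.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the session file: it only stores the data
// and counts how many times a "disk write" would have happened.
struct FakeDisk {
    int writes = 0;
    std::map<std::string, std::string> stored;
};

// Collects changes in memory and flushes them once, when the object
// (think: the per-request Context) goes out of scope.
class DeferredSession {
public:
    explicit DeferredSession(FakeDisk &disk) : m_disk(disk), m_data(disk.stored) {}
    DeferredSession(const DeferredSession &) = delete;
    DeferredSession &operator=(const DeferredSession &) = delete;

    ~DeferredSession()
    {
        if (m_dirty) { // nothing changed: skip the write entirely
            m_disk.stored = m_data;
            ++m_disk.writes;
        }
    }

    void setValue(const std::string &key, const std::string &value)
    {
        m_data[key] = value; // merged in memory, no I/O here
        m_dirty = true;
    }

    std::string value(const std::string &key) const
    {
        const auto it = m_data.find(key);
        return it == m_data.end() ? std::string() : it->second;
    }

private:
    FakeDisk &m_disk;
    std::map<std::string, std::string> m_data;
    bool m_dirty = false;
};
```

Several `setValue()` calls in one scope end up as a single write, and a scope that only reads causes no write at all. A real store would still need the file lock mentioned above around the load and the final save.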
Besides that, the TechEmpower round 13 preview runs showed a segfault in the new Cutelyst WSGI in threaded mode, a bug that was very hard to reproduce but had an easy fix. The initial fix made the code a bit ugly, so I checked whether Xcode's clang had gained support for the thread_local feature, and finally Xcode 8 supports it. If you are using Xcode's clang you now need version 8.
Many other small bug fixes also got in, and the API is probably unchanged from 0.13.0.
Now if you are a distro packager please :D package it, or if you are a dev and were afraid of API breaks, keep calm and have fun!
Cutelyst the Qt web framework just got a new release, 0.13.0.
A new release was needed now that we have this nice new logo.
Special thanks to Alessandro Longo (Alex L.) for crafting this cute logo, and a cool favicon for the Cutelyst website.
But this release ain't only about the logo, it's full of cool things:
When I started Cutelyst, a simple developer Engine (read: HTTP engine) was created. It was very slow and mostly an ugly hack, but it helped me work on the APIs that matter. I then took a look at uWSGI because a friend said it was awesome, and it was great to be able to deal with many protocols without the hassle of writing parsers for them.
Fast forward to the 0.12.0 release, and I started to feel that I was reaching a limit on Cutelyst optimizations and that uWSGI was holding us back. It wasn't only about performance: memory usage (which affects scalability) was too high for something that should be rather small, it's written in C after all.
It also has a fixed number of requests it can take. If you start it with 5 threads or processes, 5 blocking clients can be processed at the same time. If you use the async option you then have a fixed number of clients per process, so 5 processes * 5 async clients = 25 clients at the same time, but these 5 async clients are always pre-allocated, which means that each new process will also be bigger right from launch.
Think now about websockets: how can one deal with 5000 simultaneous clients? 50 processes with async = 100 each? Performance in async mode was also lower due to the complexity of dealing with them.
So before getting into writing an alternative to uWSGI in Cutelyst I did a simple experiment: I asked uWSGI to load a Cutelyst app and fork 1000 times, and wrote a simple QCoreApplication that would do the same. uWSGI used > 1GB of RAM and took around 10s to start, while the Qt app used < 300MB of RAM and took around 3s. A difference of ~700MB is a lot of RAM, and that was enough to get me started.
So Cutelyst-wsgi is born. Granted, the command line arguments are very similar to uWSGI's, and I also followed the same separation between socket and protocol handling. Of course in C++ things are more reusable, so our Protocol class has an HTTP subclass and in the future will have FastCGI and uWSGI ones too.
Did I say uWSGI before 2.1 doesn't support keep-alive? And that 2.1 is not released, nor does anyone know when it will be? Cutelyst-wsgi supports keep-alive and HTTP pipelining, is completely async and yes, performs a little better. If you put NGINX in front of uWSGI you can get keep-alive support, but guess what? The uwsgi protocol closes the connection with the front server, so it's quite hard to get very high speeds. Preliminary results of TechEmpower Benchmarks #13 showed Cutelyst hitting these limits, as other frameworks were using keep-alive properly.
Thanks to this new Engine, the Engine API got several improvements and is quite stable now. Besides that, a few other important changes were made as well:
- Change internals to take advantage of NRVO (named return value optimization)
- Improved speed of Context::uriFor(), making Cutelyst now require Qt 5.6 due to a behavior change in QUrl
- Improved speed and memory usage of the URL query parser, 1s faster in 1m iterations. Using QByteArray::split() is very convenient but it allocates more memory and a QList for the results; using ::indexOf() and manually getting the parts is both faster and more memory efficient. This is the kind of optimization we do in Cutelyst::Core and it makes a difference there, but in application code the extra complexity might not be worth it.
- C++ range-based for loops, all our Q_FOREACH & friends were replaced with range-based for loops
- Use of new reverse and equal_range iterators
- Use QHash for storing headers; this was done after several benchmarks showed that QHash was faster for all common cases. If it kept the values() in order like QMap does, it would be used in other places as well
- Replaced most QList with QVector, and internally std::vector
- Multipart/form-data parsing got faster, it doesn't seek() anymore but requires a non-sequential QIODevice as each Upload object points to parts of the body device.
- Add a few more unit tests.
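To make the split() versus indexOf() point from the list above concrete, here is a rough sketch of the index-based approach in standard C++ (std::string_view::find standing in for QByteArray::indexOf). It is not the Cutelyst parser: `parseQuery` and `QueryPairs` are names made up for this example and percent-decoding is left out.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using QueryPairs = std::vector<std::pair<std::string, std::string>>;

// Walks the query string once, locating '&' and '=' by index instead of
// first splitting it into a temporary list of substrings.
QueryPairs parseQuery(std::string_view query)
{
    QueryPairs result;
    std::size_t pos = 0;
    while (pos < query.size()) {
        std::size_t end = query.find('&', pos);
        if (end == std::string_view::npos) {
            end = query.size();
        }
        // A view into the original buffer: no copy is made yet.
        const std::string_view part = query.substr(pos, end - pos);
        if (!part.empty()) {
            const std::size_t eq = part.find('=');
            if (eq == std::string_view::npos) {
                // Key without a value, e.g. "?debug"
                result.emplace_back(std::string(part), std::string());
            } else {
                result.emplace_back(std::string(part.substr(0, eq)),
                                    std::string(part.substr(eq + 1)));
            }
        }
        pos = end + 1;
    }
    return result; // a named local returned directly, so NRVO can apply
}
```

The only allocations are the key and value strings that end up in the result; the intermediate list that a split() call would build never exists.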
Thanks to the above the core library size is also a bit smaller, ~640KB on x64.
I was planning to do a 1.0 after 0.13 but with this new engine I think it's better to have a 0.14 version, and make sure no more changes in Core will be needed for additional protocols.
Download here and enjoy!
Cutelyst, a web framework built with Qt, is closer to having its first stable release. With it becoming 3 years old at the end of the year I'm doing my best to finally iron it out and commit to a stable API/ABI. This release is full of cool stuff and a bunch of breaks, which most of the time just require recompiling.
For the last 2-3 weeks I've been working hard to get most of it unit tested; the Core behavior is now extensively tested with more than 200 tests. This has already proven its benefits, resulting in improved and fixed code.
Continuous integration got broader: it now runs with gcc and clang on both OSX and Linux (Travis) and with MSVC 12 and 14 on Windows (Appveyor). Luckily most of the features I wanted were implemented, but the compiler is pretty upsetting. Running Cutelyst on Windows would require uwsgi, which can be built with MinGW but is in an experimental state, and the developer HTTP engine is not production ready, so Windows usefulness is limited at the moment.
One of the 'hypes' of the moment is non-blocking web servers, and this release also fixes things so that uwsgi --async <number_of_requests> is properly handled. Of course there is no magic: if you enable this with blocking code the requests will still have to wait for your blocking task to finish, but there are many benefits if you have non-blocking code. At the moment, once a slot is called to process the request and, say, you want to do a GET on some web service, you can use QNetworkAccessManager to make your call and create a local QEventLoop, so that once the QNetworkReply finished() signal is emitted you continue processing. Hopefully some day the QtSql module will have an async API, but you can of course create a thread queue.
A new plugin called StatusMessage was also introduced. It generates an ID which you use when redirecting to some other page; the message is only displayed once and doesn't suffer from those flash race conditions.
The upload parser for Content-Type multipart/form-data got a huge boost in performance as it now uses QByteArrayMatcher to find boundaries; the bigger the upload, the more apparent the change is.
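For readers without Qt at hand, the idea of a pattern that is preprocessed once and then reused for every search can be sketched with the C++17 standard library. This is an analogy, not the Cutelyst code: `findBoundaries` is a hypothetical helper and std::boyer_moore_horspool_searcher stands in for QByteArrayMatcher.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the offset of every occurrence of `boundary` in `body`.
std::vector<std::size_t> findBoundaries(const std::string &body, const std::string &boundary)
{
    std::vector<std::size_t> offsets;
    if (boundary.empty()) {
        return offsets;
    }

    // The searcher preprocesses the pattern once; the same object is
    // reused for every search in the loop below.
    const std::boyer_moore_horspool_searcher searcher(boundary.begin(), boundary.end());

    auto it = body.begin();
    while (true) {
        it = std::search(it, body.end(), searcher);
        if (it == body.end()) {
            break;
        }
        offsets.push_back(static_cast<std::size_t>(it - body.begin()));
        it += static_cast<std::ptrdiff_t>(boundary.size()); // continue after this match
    }
    return offsets;
}
```

Because the preprocessing cost is paid once per boundary string, the benefit grows with the size of the body being scanned, which matches the observation that bigger uploads gain the most.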
Chunked responses also got several fixes and one great improvement which allows using them with classes like QXmlStreamWriter by just passing the Response class (which is now a QIODevice) to its constructor or setDevice(). On the first write the HTTP headers are sent and it starts creating chunks. For some reason this doesn't work when using the uwsgi protocol behind Nginx; I still need to dig and maybe disable the chunk markup depending on the protocol used by uwsgi.
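For reference, the "chunk markup" is the HTTP/1.1 chunked transfer coding: each write becomes the data size in hexadecimal, CRLF, the data, CRLF, and a zero-sized chunk ends the body. A minimal sketch of that framing, with `encodeChunk` and `lastChunk` being names invented here rather than Cutelyst functions:

```cpp
#include <cstdio>
#include <string>

// Frames one piece of body data as an HTTP/1.1 chunk.
// Empty writes produce nothing, because a zero-sized chunk would be
// read by the client as the end of the response.
std::string encodeChunk(const std::string &data)
{
    if (data.empty()) {
        return std::string();
    }
    char size[32];
    std::snprintf(size, sizeof(size), "%zx", data.size()); // size in hex
    return std::string(size) + "\r\n" + data + "\r\n";
}

// The zero-sized chunk (with no trailers) that terminates a chunked body.
std::string lastChunk()
{
    return "0\r\n\r\n";
}
```

Writing `<feed>` through this would put `6\r\n<feed>\r\n` on the wire, which is the kind of markup that would have to be skipped when the transport does its own framing.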
A Pagination class is also available to ease the work needed to write pagination, with methods to set the proper LIMIT and OFFSET on Sql queries.
Tests for the TechEmpower Framework Benchmarks were written and will be available in Round 14.
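The arithmetic behind such a helper is small. The sketch below is not the Cutelyst Pagination API; `Page`, `paginate` and `limitClause` are hypothetical names, and it assumes pages are numbered from 1.

```cpp
#include <algorithm>
#include <string>

// Result of the page computation.
struct Page {
    int limit;    // rows per page
    int offset;   // rows to skip before the current page
    int lastPage; // highest valid page number (at least 1)
};

Page paginate(int totalItems, int itemsPerPage, int currentPage)
{
    itemsPerPage = std::max(itemsPerPage, 1);
    totalItems = std::max(totalItems, 0);
    // Round up so a partially filled last page still counts.
    const int lastPage = std::max((totalItems + itemsPerPage - 1) / itemsPerPage, 1);
    // Out-of-range page numbers are pulled back into [1, lastPage].
    currentPage = std::clamp(currentPage, 1, lastPage);
    return Page{itemsPerPage, (currentPage - 1) * itemsPerPage, lastPage};
}

// Builds the clause to append to a SELECT statement.
std::string limitClause(const Page &page)
{
    return "LIMIT " + std::to_string(page.limit) + " OFFSET " + std::to_string(page.offset);
}
```

With 95 rows and 10 rows per page, page 3 maps to `LIMIT 10 OFFSET 20`, and a request for page 99 is clamped to the last page, 10.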
Last but not least there is now a QtCreator integration, which allows for creating a new project and Controller classes, but you need to manually copy (or link) the qtcreator directory to ~/.config/QtProject/qtcreator/templates/wizard.
As usual many bug fixes are in.
Help is welcome, you can mail me or hang out on #cutelyst on freenode.
Cutelyst the Qt web framework just got a new release, I was planning to do this a while back, but you know we always want to put in a few more changes and time to do that is limited.
I was also interested in seeing if the improvements in Qt 5.6 would result in better benchmark results, but they didn't. The hello world test app is probably too simple for the QString improvements to be noticed; a real world application using Grantlee views, Sql and more user data might show some difference. Still, compared to 0.10.0 Cutelyst is benchmarking the same, even though there were some small improvements.
The most important changes of this release were:
- View::Email allows for sending emails using data provided on the stash, being able to chain the rendering of the email to another Cutelyst::View so for example you can have your email template in Grantlee format which gets rendered and sent via email (requires simple-mail-qt which is a fork of SmtpClient-for-Qt with a sane API, the View::Email hides simple-mail-qt API)
- Utils::Sql provides a set of functions to be used with QSql classes. The most important ones serialize a QSqlQuery to a QVariantList of QVariantMap (or Hashes), allowing for accessing the query data on a View, plus the preparedSqlQuery() method which comes with a macro CPreparedSqlQuery, that is a lambda that keeps your prepared statement in a static QSqlQuery; this avoids the need for QSqlQuery pointers to keep the prepared queries around (which are a boost for performance)
- A macro CActionFor which resolves the Action into a static Action * object inside a lambda removing the need to keep resolving a name to an Action
- Unit tests: at the moment only very limited testing is done, but the Header class has nearly 100% coverage now
- Upload parser got a fix to properly find the multipart boundary, spotted due to the Qt client sending boundaries a bit differently from Chrome and FF; this also resulted in replacing the QRegularExpression used to match the boundary part of the header with a simple search that is 5 times faster
- Require Qt 5.5 (this allows for removal of the workaround for QJsonObject::fromVariantHash() and allows for the use of Q_ENUM and qCInfo)
- Fixed crashes, and a memory leak if Stats were enabled
- Improved usage of QStringLiteral and QLatin1String with clazy help
- Added CMake support for setting the plugins install directory
- Added more 'const' and 'auto'
- Removed uwsgi --cutelyst-reload option which never worked and can be replaced by --touch-reload and --lazy
- Improvements and fixes on cutelyst command line tool
- Many small bugs fixed
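The "static inside a lambda" trick behind CPreparedSqlQuery and CActionFor in the list above can be shown in plain C++. This is a sketch of the pattern only, not the actual Cutelyst macros: `CACHED_PREPARED`, `Prepared`, `prepare()`, `prepareCalls` and `findUserQuery()` are all invented for the example, and `prepare()` merely counts calls instead of talking to a database.

```cpp
#include <string>

// Counter that exists only so the effect can be observed.
int prepareCalls = 0;

// Hypothetical stand-in for a prepared statement.
struct Prepared {
    std::string text;
};

// Hypothetical stand-in for an expensive preparation or lookup step.
Prepared prepare(const char *statement)
{
    ++prepareCalls;
    return Prepared{statement};
}

// Expands to an immediately invoked lambda. Every expansion is a distinct
// lambda type, so every call site gets its own function-local static,
// which is initialized the first time that call site runs.
#define CACHED_PREPARED(statement) \
    ([]() -> const Prepared & { \
        static const Prepared cached = prepare(statement); \
        return cached; \
    }())

const Prepared &findUserQuery()
{
    return CACHED_PREPARED("SELECT name FROM users WHERE id = :id");
}
```

Calling `findUserQuery()` any number of times runs `prepare()` once, and no member variable or pointer is needed to keep the result around. Since the lambda captures nothing, the macro argument has to be something like a string literal rather than a local variable.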
The Cutelyst website is powered by Cutelyst and CMlyst, which is a CMS. At the moment porting CMlyst from QSettings to sqlite is on my TODO list, but only when I figure out why out of the blue I get a "database locked" error even if no other query is running (and yes, I tried query.finish()). Once I figure that out I'll make a new CMlyst release.