Status, progress, cooperation

Hello, I started a conversation with Sam on GitHub about cooperating on this project; you can read it there (https://github.com/ruby-bench/ruby-bench/issues/22#issuecomment-61726875). <- You can also read about me here

To start, I just have a few questions:

  • About the staging app: which build / repo does it use? The app I pulled from the GitHub repo behaves differently from what was described in the API/UI staging app on Heroku thread. The staging one uses JavaScript for displaying graphs. So the question is: what are the differences between the code on staging and the code in the official repo?

  • I saw lazywei started a repo to benchmark Ruby using Docker. Sorry, I haven't been able to review or test the code yet, so I just want to ask if you know what state it is in now.

  • Also, what are your plans for the future, and what is your opinion on my idea of also testing different implementations of Ruby, as I wrote in the issue on GitHub?

Since I am totally open-minded at this point, you may tweak my direction so I can better fit your needs for this project. I still need to read some more about Benchmarking Discourse locally so I know more about it. But as I read on your page Call for Long time Ruby benchmark, there are plenty of tests to use.

PS: Sorry for the non-clickable links, I was only allowed to post 2 of them :smile:

I am really not sure; perhaps @andypike can help with that.

Again, not sure about this; I think @lazywei will need to chime in.

I think we need something super basic and public out there, visible to the world. A lot of what caused this to go dormant is over-ambition; something needs to be out there. Sure, it would be nice to test different implementations and such, but the first step is having real data out there.

@sam - Well, I did my bachelor thesis on Ruby benchmarking, so by now I think I can build you a background test suite. My focus was on different Ruby implementations, so Rubinius and JRuby for sure too. I am going to build a Docker-based test suite, and I thought I could push the stats to your UI. Also, I am sure I can make some tweaks to the UI and graph display.

@andypike - Will you be able to share with me the content of the Heroku staging app? It's not the same as on GitHub.

@lazywei - Do you have any plans for your Docker testing? I am going to build something similar, but with more versions and different tests.

hmm… I think we can work on “running more specific tests”. Currently, ruby-bench-docker can only run the simple “built-in” test in Discourse, and the built-in benchmarks in Ruby. We may work on 1. more aspects to test 2. more detailed reports.

@lazywei - Okay, I will take a look at the status of your repo. I am also planning to support more Rubies; as I mentioned, at least JRuby and Rubinius, but I think it would also be interesting to look at something like TinyRB. I am planning to add the ruby-benchmark-game to the test sets too, and I also have some tests for parallelism.

I will fork your repo and play a little bit with it.

We can figure out some common metrics for the benchmarks and some UI tweaks so we can compare stats and produce some results :smile:

I saw you were measuring time outside the container. I would wrap the benchmarks in runners that use the Ruby Benchmark lib, for example, and then run the tests measuring just the time inside the container (we don't know for sure, but running a container may slow some things down).

Of course that won't be possible for start-up tests. What do you think, @lazywei?
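The runner idea above could look something like this: a small wrapper that times only the benchmark body with Ruby's stdlib Benchmark module, so the container start-up cost never enters the measurement. The `run_benchmark` helper and the sample workload are hypothetical, just to illustrate the shape of such a runner.

```ruby
require "benchmark"
require "json"

# Hypothetical runner: times only the benchmark body (inside the
# container), repeating it several times so outliers are visible.
def run_benchmark(name, iterations: 5)
  times = Array.new(iterations) do
    Benchmark.realtime { yield }
  end
  { name: name, min: times.min, avg: times.sum / times.size }
end

# Sample workload, standing in for a real benchmark script.
result = run_benchmark("string-concat") do
  s = ""
  10_000.times { s << "x" }
end

# Emit machine-readable results so the host can collect them
# without timing anything itself.
puts JSON.generate(result)
```

The host then only parses the JSON the container prints, instead of timing `docker run` from the outside.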

I think you are right. We can surely move the test into container to save time :smile:

Well, I just made a small test run overnight, and here are some results for MRI versions only. It was tested on the 'ruby benchmark game' benchmarks.

Open as csv - http://pastie.org/private/lbwvnzgarb5k6f5ngcmpea

I've actually started from scratch and have come up with an MVP over at https://railsbench.herokuapp.com/. Currently, I'm running the benchmarks on a bare metal server from Softlayer (sponsors welcome) and would love to have more collaborators on it. I really want this to take off and have plenty of time (I'm a student so time is on my side) to work on the project. Below is a brief overview of what I've done so far.

Overview

The following is a rough overview of how I implemented the application:
The Web UI has a GitHub Event Handler which receives the commits pushed to the repository through a webhook. Once the webhook has been received, the application runs jobs against the particular commit, which execute the benchmarks on a bare metal server through SSH.
In order to have consistent benchmarks, I'm renting the cheapest bare metal server from Softlayer. I'm not sure if we really need a bare metal server, but Sam Saffron mentioned something about it in his blog post. The benchmarks are executed in Docker containers, which is similar to what Bert did for GSoC and in line with the discussion here: Runner on Docker.
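A minimal sketch of the webhook-to-benchmark flow described above: a GitHub "push" payload comes in, and each commit in it is handed to a runner that would, in the real setup, execute the benchmarks on the bare metal server over SSH. The class and method names here are illustrative, not the actual railsbench code.

```ruby
require "json"

# Hypothetical event handler: extracts commit SHAs from a GitHub
# "push" webhook payload and schedules one benchmark job per commit.
class GithubEventHandler
  def initialize(runner)
    @runner = runner # callable that runs benchmarks for one SHA
  end

  def handle_push(payload_json)
    payload = JSON.parse(payload_json)
    payload.fetch("commits", []).each do |commit|
      # In production this would be e.g.:
      #   ssh benchmark-server "run_benchmarks #{commit['id']}"
      @runner.call(commit["id"])
    end
  end
end

# Fake runner for demonstration; it just records which SHAs ran.
ran = []
handler = GithubEventHandler.new(->(sha) { ran << sha })
handler.handle_push('{"commits":[{"id":"abc123"},{"id":"def456"}]}')
puts ran.inspect
```

Keeping the runner as an injected callable makes it easy to swap the SSH call for a local run during development.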

Rails Benchmark

Rails Benchmark URL: https://railsbench.herokuapp.com/tgxworld/rails

For the Rails Benchmark, I forked ko1-test-app by Aaron and modified it to post the results of object allocations for an index action which fetches and renders 100 users and their attributes. I read the previous thread and felt that creating our own app is the cheapest way to get things going. Discourse is currently running on Rails master, but it will be tough if we're benchmarking “two moving targets” (~ Matthew). Perhaps we might be able to use a fixed code base of Discourse and run the benchmarks against it.

Ruby Benchmark

Ruby Benchmark URL: https://railsbench.herokuapp.com/tgxworld/ruby

For the Ruby Benchmark, I used the existing Ruby benchmarks. Not sure if this is relevant to the Rails community, but I thought it would be good to implement it anyway. Currently I'm running the benchmarks once, as the whole suite takes close to 6 mins on my machine.

Next Steps

  • What other metrics do we need to track? (ko1-test-app has scripts for GC, number of requests/s…)
  • Set up a webhook on the Rails repo so that I don't have to track the changes from a fork.
  • Thoughts and feedback in general?
  • More eyes to look at my code :stuck_out_tongue:
  • Some sort of variation tracking system to notify contributors of anomalies
  • Per-release benchmarks of major apps?
  • Improve the design

@sam Would love to have your input on this :smile:

@tgxworld - hey, did you check my results? You can also find the benchmark suite in Docker at https://github.com/Ryccoo/docker-ruby-benchmark. I am currently working on it as my bachelor thesis. I will add more tests to it later.

Just one thing is unclear to me: which repo are the webhooks that trigger the benchmarks on, and why?

And the Web UI is kind of misleading to me; I don't know what you are trying to say with the results and what the numbers mean.

Hi @richard_ludvigh :smiley:

Just one thing is unclear to me: which repo are the webhooks that trigger the benchmarks on, and why?

The GitHub webhooks are being triggered by me manually, daily, from a fork of the repo. This will be so until I'm able to get Rails core and Ruby core to add the webhooks to the official repos.

And the Web UI is kind of misleading to me; I don't know what you are trying to say with the results and what the numbers mean.

Ah yes. This part will definitely need more work to make it clear. For now I just wanted a quick proof of concept before more work is done. Anyway, for Rails, the numbers represent object allocations for an index action which fetches 100 users from the DB and renders them in a list. For Ruby, the numbers represent the time in seconds for each benchmark to run.
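For anyone curious how an "object allocations" number like this can be measured, here is one common approach using `GC.stat` from the Ruby stdlib. This is a generic sketch, not the actual ko1-test-app scripts, and the sample workload is hypothetical.

```ruby
# Counts how many Ruby objects a block allocates, by diffing the
# cumulative allocation counter before and after the block runs.
def count_allocations
  GC.disable # avoid a GC run skewing timing-sensitive workloads
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

# Stand-in for "render an index of 100 users": builds 100 strings.
allocated = count_allocations do
  Array.new(100) { |i| "user-#{i}" }
end

puts allocated
```

The counter is cumulative per process, so the diff includes every object the block creates, not just the ones still live afterwards.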

Ah, OK. I just had a brief look at your repository. I think the main difference from what I'm working on is that I'm only interested in the performance tracking of Ruby trunk and Rails master. In summary, I'm running the benchmarks on a per-commit basis.

@tgxworld - yeah, but I think our ideas can merge somewhere in between. My plan is to run benchmarks like the 'ruby benchmark game', the official Ruby benchmarks, or the cursera benchmarks on different versions of Ruby (I am managing the Dockerfiles with Ruby versions manually for now; they don't come out that often).

My idea was to run the same benchmark on different versions and learn how and why the speed of Ruby code changes (yeah, sometimes ruby-1.9.2 was much faster in my tests than ruby 2.1.2 :slight_smile: mainly because of security issues).

Also, I saw you trigger the tests on each commit, which can be unnecessary sometimes. As I saw, there was a commit where just a small typo in a comment was fixed, and some benchmark went from 3.00 to 2.90, which I think should not happen :smile:

I think we can work something out together, even as two separate platforms (I will just use your UI but present other results).

@richard_ludvigh

My idea was to run the same benchmark on different versions and learn how and why the speed of Ruby code changes (yeah, sometimes ruby-1.9.2 was much faster in my tests than ruby 2.1.2 smile mainly because of security issues).

Yup! I think this is one of the objectives we want to achieve: upgrade to Ruby 2.X to get XX% performance improvement.

Also, I saw you trigger the tests on each commit, which can be unnecessary sometimes. As I saw, there was a commit where just a small typo in a comment was fixed, and some benchmark went from 3.00 to 2.90, which I think should not happen smile

Yeah :sweat_smile: I'm skipping the benchmarks when [CI SKIP] is present, though.
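A [CI SKIP] filter like the one mentioned above boils down to a check on the commit message before queuing the benchmarks. The helper name and the case-insensitive matching here are assumptions; the actual railsbench marker handling may differ.

```ruby
# Returns true when benchmarks should run for this commit, i.e.
# when the commit message does not contain a "[CI SKIP]" marker.
def run_benchmarks_for?(commit_message)
  !commit_message.match?(/\[ci skip\]/i)
end

puts run_benchmarks_for?("Fix typo in comment [CI SKIP]") # => false
puts run_benchmarks_for?("Speed up Array#pack")           # => true
```

This would let the typo-only commits Richard mentioned pass through without producing a meaningless data point.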

@tgxworld

As I just checked, I only have 13 tests in the suite right now, and for all the versions I have (12 currently: 10 MRI + JRuby, Rubinius) it takes about 1.5-2 days, I think, to finish. I am running each test 10 times.

I will talk to my bachelor thesis advisor (the thesis and the advisor are from Red Hat), so maybe I can get some machines on OpenShift where I can run my tests without disturbance.

I will post them here then. I also have to get the other tests running, but I am not sure about that, as Sam (I think) said there are still troubles running the cursera benchmarks on Rubinius, for example.

I can think of 2 critical things here.

  1. There are piles of noise in the UI; it's hard to follow what everything means

    • clean up the UI so commit messages are not so prominent on the page
    • add a note about what the graph means; do you track multiple metrics?
  2. We desperately need a backfill of data; even picking one build every week for the last 2 years would be good.

Really happy you are making progress here. If you can get this into nice shape, I am sure I can get funding for hosting sorted out.

@sam - Have you had time to look at my suite? (https://github.com/Ryccoo/docker-ruby-benchmark) I will probably set up a second domain for it, as it does not focus on commits so much as on released versions.

I am just waiting for some metal to run the tests on, as they take a few days to finish. Also, I want to ask if you have any suggestions on tests to run?

@tgxworld - I think we can split the benchmarking job, so you can focus mainly on Rails benchmarking and I will take care of Ruby benchmarking. Also, I think it would be good to test some other implementations too, some without a GIL, as they can support true concurrency. Maybe it will end up giving some useful results :smile:

Hello. IRFY author here. (http://www.isrubyfastyet.com/)

I would love to have a single, useful site where people can go for both high-level and detailed Ruby and Rails performance information. If we could have that, then I'd gladly surrender the IRFY domain. We don't need a plethora of semi-similar projects.

@tgxworld Looks pretty good. As Sam said earlier, it seems that a lot of projects have died from over-ambition. IMHO, after a bit of UI work, you can ship your project: since there are so many Ruby benchmarks, you may want some page with a bunch of tiny graphs that you can click into. Something like: http://speed.pypy.org/timeline/#/?exe=3,6,1,5&base=2+472&ben=grid&env=1&revs=200&equid=off

A homepage for casual visitors would also be nice. (e.g. http://speed.pypy.org/, arewefastyet.com)

I'm willing to help out a bit. What are your next steps?

@richard_ludvigh Why does your suite take so long? If your suite takes multiple days to run, it can make iterating very hard. I had a goal with IRFY that it had to finish overnight. (Granted, IRFY is tightly coupled because it's a mountain of spaghetti Ruby and Bash.)

Yup lots more work to do for the UI. Thanks for the examples!

Seems like the main concern now is the presentation of the data, which I'll work on this week. Other than that, I'm currently checking with the Rails core to find out what they need, or rather what needs to be tracked. Hmm, I'm open to options.

@brianhempel - It needs a couple of days to run each benchmark 10 times on 12 different versions. After we have these results, we will only need 1/12 of that time to test one new version after a release :smile: (some benchmarks take like 3 mins; now do that 10 times for 12 versions and you've got 6 hours already)
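The back-of-the-envelope math above checks out for a single 3-minute benchmark:

```ruby
# One 3-minute benchmark, repeated 10 times, across 12 Ruby versions.
minutes_per_run = 3
runs_per_version = 10
versions = 12

total_minutes = minutes_per_run * runs_per_version * versions
puts "#{total_minutes} minutes = #{total_minutes / 60} hours" # => 360 minutes = 6 hours
```

So with 13 benchmarks of varying lengths, a 1.5-2 day total is plausible.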

Also, the more test runs, the more accurate the results :slight_smile: