[PageOneX] [dev] Urgent: PageOneX is down

pablo rey pablo at basurama.org
Thu Jan 23 16:31:24 EST 2014


Thanks Rafa for your detailed description!

What a coincidence, PageOneX just crashed again minutes ago :-/
Perfect moment to try to understand what is going on.




On Thu, Jan 23, 2014 at 8:37 PM, Rafael Porres Molina <rporres at gmail.com>wrote:

> Hi,
>
> I don't know if I'm going to be able to make tonight.
>
> Anyway this is what I saw talking with Pablo the other day.
>
> - The instance where Rails and MySQL is running seems undersized
> memory-wise.
>
> numeroteca at li563-7:~$ head -1 /proc/meminfo
> MemTotal:        1104816 kB
>
> which is around 1 Gb of memory.
>
> - Just before hitting the error in the video Pablo sent, you can see that
> CPU I/O wait spiked, and the kernel swapping thread was most consuming
> process that top was showing. That indicates that the system was trying to
> page out memory pages to swap. As to why the browser returns a 500 error,
> it could be that the ruby process dies or it hits a response timeout for
> the Apache request (I see that Rails runs behind Apache, but I haven't had
> the time to take a look to the configuration).
>
> - As for the MySQL crashes, they also seem related to memory. Mysql error
> logs show errors like the following:
>
> 140116  9:40:01 [Note] Plugin 'FEDERATED' is disabled.
> 140116  9:40:01 InnoDB: The InnoDB memory heap is disabled
> 140116  9:40:01 InnoDB: Mutexes and rw_locks use GCC atomic builtins
> 140116  9:40:01 InnoDB: Compressed tables use zlib 1.2.3.4
> 140116  9:40:01 InnoDB: Initializing buffer pool, size = 128.0M
> InnoDB: mmap(137363456 bytes) failed; errno 12
> 140116  9:40:01 InnoDB: Completed initialization of buffer pool
> 140116  9:40:01 InnoDB: Fatal error: cannot allocate memory for the buffer
> pool
> 140116  9:40:01 [ERROR] Plugin 'InnoDB' init function returned error.
> 140116  9:40:01 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE
> failed.
> 140116  9:40:01 [ERROR] Unknown/unsupported storage engine: InnoDB
> 140116  9:40:01 [ERROR] Aborting
>
> that indicates that it crashed while trying to allocate memory for the
> innodb buffer pool. As it is not explicitly configured, in mysql 5.5 the
> default is 128Mb, which is not much unless the instance where MySQL is
> running is so short in memory.
>
>
> http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size
>
> The easiest thing to do is to try to get more memory for the instance
> where pageonex is running. How much? Well, as much as you can afford :)
>
> Regards,
>
> Rafa
>
>
>
> 2014/1/23 pablo rey <pablo at basurama.org>
>
>> Hi there,
>> Reminder that we are meeting in 2 and a half hours (4.30pm EST) today!
>> Send me your gmail account so that I can add you to the hangout.
>>
>> The idea is to brainstorm and understand why PageOneX is crashing when
>> making long and short threads (500 internal server error) #179<https://github.com/numeroteca/pageonex/issues/179>+ why is the mySQL crashing as well.
>>
>> It would be great to understand how the server is built.
>> After a short talk with Rafa Porres, we figured out that is probably
>> related to a lack of memory (now we have 1Gb).
>>
>> Folks in Mexico made their own PageOnex deployment and research<http://estadodelperiodismo.articulo19.org/>and they also reported related issues, please, Alecs, post your problems in
>> Github and share them through this list!
>>
>> See you in a while,
>> pablo
>>
>>
>> On Tue, Jan 21, 2014 at 6:27 PM, pablo rey <pablo at basurama.org> wrote:
>>
>>> Reminder for everyone interested: this Thursday Jan 23rd 4.30pm EST
>>> (10.30pm CET) we have the session to try to solve different problems with
>>> the server crashing at PageOneX.
>>>
>>> I made a video showing how it goes to "500 internal server error" when
>>> editing a thread and adding one extra day, including logs and top:
>>> http://www.youtube.com/watch?v=wuBMPu7Hty0
>>>
>>>
>>> On Thu, Jan 9, 2014 at 12:08 AM, pablo rey <pablo at basurama.org> wrote:
>>>
>>>> Great Rafa, I'll write you off the list. If anyone is interested to
>>>> join any of the meetings, shoot me an email!
>>>>
>>>>
>>>> On Wed, Jan 8, 2014 at 11:55 PM, Rafael Porres Molina <
>>>> rporres at gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'd be happy to join if I can be of any help.
>>>>>
>>>>> Pablo: We could arrange a previous hangout between you and me so that
>>>>> you can give me some context to be more helpful.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Rafa
>>>>>
>>>>>
>>>>> 2014/1/8 Edward Platt <elplatt at media.mit.edu>
>>>>>
>>>>>> Works for me, talk then!
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 8, 2014 at 5:21 PM, pablo rey <pablo at basurama.org> wrote:
>>>>>>
>>>>>>> Given the 6 hours of difference Jan 23rd 4.30pm EST (10.30pm CET,
>>>>>>> Spain) is good for me.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jan 8, 2014 at 11:17 PM, Edward Platt <elplatt at media.mit.edu
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Woks for me, my day should be open so far.  What time's good for
>>>>>>>> you?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jan 8, 2014 at 5:13 PM, pablo rey <pablo at basurama.org>wrote:
>>>>>>>>
>>>>>>>>> Let's set another date, then. Thursday Jan 23rd?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jan 8, 2014 at 9:05 PM, Edward Platt <
>>>>>>>>> elplatt at media.mit.edu> wrote:
>>>>>>>>>
>>>>>>>>>> Touching base sounds good.  After Jan 22 would be best.
>>>>>>>>>>
>>>>>>>>>> -Ed
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 8, 2014 at 2:58 PM, pablo rey <pablo at basurama.org>wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Ed,
>>>>>>>>>>> I had to restart mysql again minutes ago :-/
>>>>>>>>>>> The problem with this issue and specially with #179<https://github.com/numeroteca/pageonex/issues/179>is that early adopters do not come back after a first failed attempt to
>>>>>>>>>>> create a thread.
>>>>>>>>>>>
>>>>>>>>>>> I would suggest to set up a google hangout during IAP (=January
>>>>>>>>>>> for non MIT folks!) to brainstorm together on this.
>>>>>>>>>>> What do you think? anyone interested?
>>>>>>>>>>> Which other sites are hosted in that server?
>>>>>>>>>>>
>>>>>>>>>>> How about next Monday Jan 13th at 4.30pm EST (10.30pm CET,
>>>>>>>>>>> Spain)?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> p
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Dec 26, 2013 at 2:31 AM, Edward Platt <
>>>>>>>>>>> elplatt at media.mit.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the update Pablo.  I don't see anything telling in
>>>>>>>>>>>> the logs you've sent.  The mysql problems may be unrelated to PageOneX as
>>>>>>>>>>>> the server is shared.  It's also possible that the performance problems we
>>>>>>>>>>>> were seeing before are somehow causing this.  I've been working out how to
>>>>>>>>>>>> best fix those, but there doesn't seem to be a quick fix.  It's on our
>>>>>>>>>>>> radar!
>>>>>>>>>>>>
>>>>>>>>>>>> -Ed
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Dec 25, 2013 at 4:06 PM, pablo rey <pablo at basurama.org>wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I had to restart the mysql again today!
>>>>>>>>>>>>>
>>>>>>>>>>>>> This time it crashed after:
>>>>>>>>>>>>> Started GET "/about" for 91.206.178.40 at 2013-12-24 23:20:01
>>>>>>>>>>>>> +0000
>>>>>>>>>>>>> Processing by HomeController#about as HTML
>>>>>>>>>>>>>   Rendered home/about.html.erb within layouts/application
>>>>>>>>>>>>> (39.1ms)
>>>>>>>>>>>>> Completed 200 OK in 340ms (Views: 268.5ms | ActiveRecord:
>>>>>>>>>>>>> 0.0ms)
>>>>>>>>>>>>> Started GET "/numeroteca/corrupcion-espana-diciembre-2013/"
>>>>>>>>>>>>> for 91.206.178.40 at 2013-12-24 23:20:02 +0000
>>>>>>>>>>>>> Processing by CodingController#display as HTML
>>>>>>>>>>>>>   Parameters: {"username"=>"numeroteca",
>>>>>>>>>>>>> "thread_name"=>"corrupcion-espana-diciembre-2013"}
>>>>>>>>>>>>>   Rendered coding/_coding_footer.html.erb (0.3ms)
>>>>>>>>>>>>>   Rendered coding/display.html.erb within layouts/application
>>>>>>>>>>>>> (5064.0ms)
>>>>>>>>>>>>> Completed 200 OK in 5614ms (Views: 2025.2ms | ActiveRecord:
>>>>>>>>>>>>> 3365.2ms)
>>>>>>>>>>>>> Started GET "/users/sign_up" for 91.206.178.40 at 2013-12-24
>>>>>>>>>>>>> 23:20:09 +0000
>>>>>>>>>>>>> Processing by Devise::RegistrationsController#new as HTML
>>>>>>>>>>>>>   Rendered devise/_links.erb (9758.9ms)
>>>>>>>>>>>>>   Rendered devise/registrations/new.html.erb within
>>>>>>>>>>>>> layouts/application (18132.6ms)
>>>>>>>>>>>>> Completed 200 OK in 18877ms (Views: 18569.0ms | ActiveRecord:
>>>>>>>>>>>>> 0.0ms)
>>>>>>>>>>>>> Started GET "/users/sign_in" for 91.206.178.40 at 2013-12-24
>>>>>>>>>>>>> 23:20:29 +0000
>>>>>>>>>>>>> Processing by Devise::SessionsController#new as HTML
>>>>>>>>>>>>>   Rendered devise/_links.erb (1.2ms)
>>>>>>>>>>>>>   Rendered devise/sessions/new.html.erb within
>>>>>>>>>>>>> layouts/application (102.2ms)
>>>>>>>>>>>>> Completed 200 OK in 477ms (Views: 272.0ms | ActiveRecord:
>>>>>>>>>>>>> 0.0ms)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Connecting to database specified by database.yml
>>>>>>>>>>>>> Connecting to database specified by database.yml
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Dec 22, 2013 at 11:57 PM, pablo rey <
>>>>>>>>>>>>> pablo at basurama.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had to restart mysql again minutes ago.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the rails log before crashing:
>>>>>>>>>>>>>> Started GET "/numeroteca/local-national-world-news/coding"
>>>>>>>>>>>>>> for 5.10.83.101 at 2013-12-22 19:11:02 +0000
>>>>>>>>>>>>>> Processing by CodingController#process_images as */*
>>>>>>>>>>>>>>   Parameters: {"username"=>"numeroteca",
>>>>>>>>>>>>>> "thread_name"=>"local-national-world-news"}
>>>>>>>>>>>>>>   Rendered coding/_coding_footer.html.erb (0.1ms)
>>>>>>>>>>>>>>   Rendered coding/process_images.html.erb within
>>>>>>>>>>>>>> layouts/application (3323.2ms)
>>>>>>>>>>>>>> Completed 200 OK in 3331ms (Views: 1658.8ms | ActiveRecord:
>>>>>>>>>>>>>> 1668.8ms)
>>>>>>>>>>>>>> Started GET "/" for 67.23.226.7 at 2013-12-22 19:11:10 +0000
>>>>>>>>>>>>>> Processing by HomeController#index as HTML
>>>>>>>>>>>>>>   Rendered home/index.html.erb within layouts/application
>>>>>>>>>>>>>> (184.1ms)
>>>>>>>>>>>>>> Completed 200 OK in 284ms (Views: 163.3ms | ActiveRecord:
>>>>>>>>>>>>>> 120.0ms)
>>>>>>>>>>>>>> Started GET "/assets/application.css" for 67.23.226.7 at
>>>>>>>>>>>>>> 2013-12-22 19:11:16 +0000
>>>>>>>>>>>>>> Served asset /application.css - 304 Not Modified (522ms)
>>>>>>>>>>>>>> Started GET "/Donna/juicio-marita-veron-cronica/export.ods"
>>>>>>>>>>>>>> for 5.10.83.105 at 2013-12-22 19:11:17 +0000
>>>>>>>>>>>>>> Processing by ThreadsController#export as ODS
>>>>>>>>>>>>>>   Parameters: {"username"=>"Donna",
>>>>>>>>>>>>>> "thread_name"=>"juicio-marita-veron-cronica"}
>>>>>>>>>>>>>> Sent file /tmp/export20131222-12134-1228f0a.ods (0.1ms)
>>>>>>>>>>>>>> Completed 200 OK in 167ms (ActiveRecord: 95.3ms)
>>>>>>>>>>>>>> Started GET
>>>>>>>>>>>>>> "/davidmbusto/dragado-en-bahia-blanca/export.json" for 180.76.5.75 at
>>>>>>>>>>>>>> 2013-12-22 19:11:17 +0000
>>>>>>>>>>>>>> Processing by ThreadsController#export as JSON
>>>>>>>>>>>>>>   Parameters: {"username"=>"davidmbusto",
>>>>>>>>>>>>>> "thread_name"=>"davidmbusto"}
>>>>>>>>>>>>>> Completed 200 OK in 93ms (Views: 0.2ms | ActiveRecord: 4.6ms)
>>>>>>>>>>>>>> Started GET "/threads/" for 67.23.226.7 at 2013-12-22
>>>>>>>>>>>>>> 19:11:19 +0000
>>>>>>>>>>>>>> Processing by ThreadsController#index as HTML
>>>>>>>>>>>>>> Connecting to database specified by database.yml
>>>>>>>>>>>>>> Connecting to database specified by database.yml
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After restarting pagination in the thread list  is not
>>>>>>>>>>>>>> working again<https://github.com/numeroteca/pageonex/issues/182>.
>>>>>>>>>>>>>> I mention it as pageonex crashed after a GET "/threads/" request. What
>>>>>>>>>>>>>> could be the cause for making mysql die? Listing the threads doesn't seem
>>>>>>>>>>>>>> very problematic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Besides, I was trying to get a sense of the usage of PageOneX
>>>>>>>>>>>>>> web site using the Piwik statistics. According to Piwik only 4 visits to
>>>>>>>>>>>>>> the site today. According to the logs many many more. I assume many people
>>>>>>>>>>>>>> have the stats blocked but the difference isway  too high... guesses?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Happy holidays!
>>>>>>>>>>>>>> p
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Dec 15, 2013 at 1:53 AM, Edward Platt <
>>>>>>>>>>>>>> elplatt at media.mit.edu> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should be fixed now.  Mysql wasn't running, not sure why and
>>>>>>>>>>>>>>> there are no relevant log entries.  If you ever feel the need to restart
>>>>>>>>>>>>>>> mysql, please go ahead.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Ed
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Dec 14, 2013 at 3:52 PM, pablo rey <
>>>>>>>>>>>>>>> pablo at basurama.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Should I just restart the mysql server?
>>>>>>>>>>>>>>>>   sudo service mysql restart
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Dec 14, 2013 at 9:09 PM, pablo rey <
>>>>>>>>>>>>>>>> pablo at basurama.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Rahul and Ed!The home page returns a passenger message:
>>>>>>>>>>>>>>>>> Ruby (Rack) application could not be started
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Error message: Can't connect to local MySQL server
>>>>>>>>>>>>>>>>> through socket '/var/run/mysqld/mysqld.sock' (111) (Mysql2::Error)Exception
>>>>>>>>>>>>>>>>> class: PhusionPassenger::UnknownError
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Which is the same message I was getting and reported at
>>>>>>>>>>>>>>>>> https://github.com/numeroteca/pageonex/issues/179#issuecomment-30275654(but didn't make the app crashed).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It looks like it's been down since yesterday.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> p
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Pageonexdev mailing list
>>>>>>>>>>>>>>>> Pageonexdev at mit.edu
>>>>>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Edward L. Platt
>>>>>>>>>>>>>>> Civic Technology Programmer
>>>>>>>>>>>>>>> MIT Center for Civic Media
>>>>>>>>>>>>>>> E15-348
>>>>>>>>>>>>>>> http://civic.mit.edu
>>>>>>>>>>>>>>> http://elplatt.com
>>>>>>>>>>>>>>> @elplatt <http://twitter.com/elplatt>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Pageonexdev mailing list
>>>>>>>>>>>>>>> Pageonexdev at mit.edu
>>>>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Edward L. Platt
>>>>>>>>>>>> Civic Technology Programmer
>>>>>>>>>>>> MIT Center for Civic Media
>>>>>>>>>>>> E15-348
>>>>>>>>>>>> http://civic.mit.edu
>>>>>>>>>>>> http://elplatt.com
>>>>>>>>>>>> @elplatt <http://twitter.com/elplatt>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Pageonexdev mailing list
>>>>>>>>>>>> Pageonexdev at mit.edu
>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Edward L. Platt
>>>>>>>>>> Civic Technology Programmer
>>>>>>>>>> MIT Center for Civic Media
>>>>>>>>>> E15-348
>>>>>>>>>> http://civic.mit.edu
>>>>>>>>>> http://elplatt.com
>>>>>>>>>> @elplatt <http://twitter.com/elplatt>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Edward L. Platt
>>>>>>>> Civic Technology Programmer
>>>>>>>> MIT Center for Civic Media
>>>>>>>> E15-348
>>>>>>>> http://civic.mit.edu
>>>>>>>> http://elplatt.com
>>>>>>>> @elplatt <http://twitter.com/elplatt>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Edward L. Platt
>>>>>> Civic Technology Programmer
>>>>>> MIT Center for Civic Media
>>>>>> E15-348
>>>>>> http://civic.mit.edu
>>>>>> http://elplatt.com
>>>>>> @elplatt <http://twitter.com/elplatt>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pageonexdev mailing list
>>>>>> Pageonexdev at mit.edu
>>>>>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pageonexdev mailing list
>>>>> Pageonexdev at mit.edu
>>>>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>>>>
>>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Pageonexdev mailing list
>> Pageonexdev at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>>
>>
>
> _______________________________________________
> Pageonexdev mailing list
> Pageonexdev at mit.edu
> http://mailman.mit.edu/mailman/listinfo/pageonexdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/pipermail/pageonexdev/attachments/20140123/1206497c/attachment-0001.htm


More information about the Pageonexdev mailing list