IRC logs for #aegir, 2015-06-24 (GMT)

[10:05:35]* Egyptian[Home] has joined #aegir
[10:17:05]* hestenet has joined #aegir
[10:18:51]* maestrojed has quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:41:54]* maestrojed has joined #aegir
[10:47:42]* maestrojed has quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:48:12]* maestrojed has joined #aegir
[11:53:39]* hestenet has quit ()
[12:10:06]* Egyptian[Home] has quit (Ping timeout: 272 seconds)
[12:17:06]* zigmoo has joined #aegir
[12:27:12]* gusaus has quit (Quit: gusaus)
[12:28:12]* gusaus has joined #aegir
[12:58:54]* glennpratt has joined #aegir
[13:13:58]* bgm has quit (Remote host closed the connection)
[13:14:06]* bgm has joined #aegir
[13:22:16]* bgm_ has joined #aegir
[13:24:21]* bgm has quit (Ping timeout: 246 seconds)
[13:24:55]* bgm_ is now known as bgm
[13:25:05]* bgm has quit (Changing host)
[13:25:06]* bgm has joined #aegir
[13:58:46]* johnstorey has joined #aegir
[14:08:14]* bgm has quit (Ping timeout: 252 seconds)
[14:08:55]* bgm has joined #aegir
[14:10:31]* gusaus_ has joined #aegir
[14:11:29]* gusaus has quit (Ping timeout: 264 seconds)
[14:11:30]* gusaus_ is now known as gusaus
[14:35:58]* glennpratt has quit (Remote host closed the connection)
[14:48:36]* maestrojed has quit (Read error: Connection reset by peer)
[14:55:42]* johnstorey has quit (Ping timeout: 246 seconds)
[14:57:41]* Yaazkal has quit ()
[15:00:30]* maestrojed has joined #aegir
[15:09:00]* realityloopAFK is now known as realityloop
[15:18:41]* realityloop is now known as realityloopAFK
[15:41:09]* realityloopAFK is now known as realityloop
[15:46:34]* glennpratt has joined #aegir
[15:46:39]* realityloop is now known as realityloopAFK
[15:51:37]* glennpratt has quit (Ping timeout: 276 seconds)
[15:55:32]* David_Hernandez has joined #aegir
[16:41:09]* beautifulmind has joined #aegir
[16:41:15]* beautifulmind has left #aegir ()
[17:26:39]* maestrojed has quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[17:35:33]* ivanjaros has joined #aegir
[17:45:21]* zigmoo has quit (Read error: Connection reset by peer)
[17:45:51]* zigmoo has joined #aegir
[18:09:59]* ivanjaros has quit (Quit: https://drupal.org/user/135190)
[18:17:16]* gandhiano_ has joined #aegir
[18:22:31]* sdrycroft has joined #aegir
[18:23:22]* ivanjaros has joined #aegir
[19:11:34]* orangey has quit (Quit: No Ping reply in 180 seconds.)
[19:12:51]* orangey has joined #aegir
[19:15:15]* realityloopAFK is now known as realityloop
[19:19:13]* e-anima has joined #aegir
[19:48:32]* gusaus has quit (Quit: gusaus)
[20:07:33]* realityloop is now known as realityloopAFK
[20:34:09]* nicholasalipaz_ has quit (Quit: WeeChat 0.3.7)
[20:35:27]* ivanjaros has quit (Quit: https://drupal.org/user/135190)
[20:36:06]* apassi has joined #aegir
[20:37:50]* apassi has quit (Client Quit)
[21:04:27]* beautifulmind has joined #aegir
[21:15:06]* ivanjaros has joined #aegir
[21:30:13]* gandhiano_ has quit (Ping timeout: 250 seconds)
[21:31:33]* joestewart has quit (Ping timeout: 252 seconds)
[21:35:38]* joestewart has joined #aegir
[22:26:41]* zombiebeard has joined #aegir
[22:30:53]* ivanjaros has quit (Quit: https://drupal.org/user/135190)
[22:35:07]* zz_drakythe is now known as drakythe
[22:54:22]* mstenta has joined #aegir
[23:13:47]* beautifulmind has quit (Quit: Leaving.)
[23:42:40]* glennpratt has joined #aegir
[00:13:20]* fatguylaughing has joined #aegir
[00:20:54]<ergonlogic>bgm: still need help?
[00:21:53]<bgm>ergonlogic: yep :)
[00:22:09]<bgm>ergonlogic: got a min? I got most of the issues resolved, but I think I'm missing something obvious
[00:22:12]<ergonlogic>sure
[00:22:33]<bgm>If I run this as the Aegir user, everything works fine: drush '@hostmaster' hosting-civicrm_cron --items=5 --debug --strict=0
[00:22:41]<bgm>but otherwise, the crons do not run
[00:22:46]<ergonlogic>I haven't looked at this code for awhile mind you...
[00:22:59]<ergonlogic>that's normal
[00:23:09]<ergonlogic>oh wait
[00:23:23]<ergonlogic>so, I mean that you can't run it as another user
[00:23:29]<ergonlogic>since it's using an alias
[00:23:35]<ergonlogic>that lives in ~/.drush
[00:23:40]<ergonlogic>obviously
[00:24:02]<ergonlogic>you have a crontab for the 'aegir' user?
[00:24:06]<bgm>well, I mean that I used that command to debug what was not working, and fixed a few bugs that way
[00:24:15]<bgm>otherwise, I'm running the queue-runner
[00:24:31]<ergonlogic>yeah, the queued is only for the task queue
[00:24:36]<ergonlogic>it doesn't run other queues
[00:24:37]<bgm>*/1 * * * * /usr/bin/env php /var/aegir/.composer/vendor/drush/drush/drush.php '@hostmaster' hosting-dispatch
[00:24:42]<bgm>my crontab for aegir has this ^
[00:24:43]<ergonlogic>for that, you still need a crontab
[00:24:51]<ergonlogic>ok, that should do it
[00:24:59]<bgm>the drupal crons seem to run OK
[00:25:12]<ergonlogic>ok, so the queue system looks like it's ok
[00:25:32]<ergonlogic>so, it's something specific to the civi cron queue, presumably
[00:26:06]<ergonlogic>what if you run hosting-dispatch manually?
[00:26:25]<ergonlogic>set the queue interval down to 1 second
[00:26:30]<ergonlogic>if you haven't already
[00:26:44]<ergonlogic>to ensure it gets called whenever we dispatch
[00:27:34]<bgm>1 min, checking the output
[00:29:18]<bgm>ergonlogic: hmm, odd, i do see things like this: http://paste.debian.net/hidden/4e705e37/
[00:30:02]<ergonlogic>hmm
[00:30:14]<ergonlogic>I have to go for about 30 mins
[00:30:15]<ergonlogic>aorry
[00:30:17]<ergonlogic>sorry
[00:30:25]<ergonlogic>can we take this up as soon as I'm back?
[00:30:25]<bgm>np, thanks for the pointers, i'll continue digging
[00:30:36]<bgm>sure, thanks again :)
[00:40:34]* sdrycroft has quit (Quit: Leaving.)
[01:04:01]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[01:06:12]* DecipheredAFK has joined #aegir
[01:10:44]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[01:13:10]* DecipheredAFK has joined #aegir
[01:19:19]<ergonlogic>bgm: back
[01:19:24]<ergonlogic>any progress?
[01:20:39]<bgm>ergonlogic: i think i'm misunderstanding how the queue system works for cron
[01:21:11]<ergonlogic>ok
[01:21:46]<ergonlogic>so, for context, this is basically upgrading hosting_civicrm to Aegir3?
[01:21:47]<bgm>i'm not too sure how to formulate my questions :)
[01:21:54]<bgm>ergonlogic: yep
[01:22:00]<ergonlogic>we can work through it
[01:22:27]* DecipheredAFK has quit (Quit: ZNC - http://znc.in)
[01:22:37]<bgm>my civi cron is set to run every 15 mins, but if I don't start the hosting-dispatch manually, it doesn't seem to run
[01:22:45]<bgm>ok, facepalm
[01:22:51]<bgm>i guess i had to say it out loud :)
[01:23:14]* DecipheredAFK has joined #aegir
[01:23:20]<bgm>the crontab call is suspicious
[01:23:22]<ergonlogic>ok...
[01:23:41]<ergonlogic>is it just that civi was ignoring crons because it was too soon?
[01:24:11]<bgm>when i test, i run:
[01:24:12]<bgm>$ drush @hm vset hosting_queue_civicrm_cron_last_run 1235156009
[01:24:17]<bgm>$ drush @hm vset hosting_queue_cron_last_run 1235156009
[01:24:37]<bgm>(to avoid the "being called too early")
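The two vset commands above push the recorded last-run time back to early 2009, so the queue no longer looks like it is being called too early. A minimal sketch of that kind of check, assuming a queue counts as due once its interval has elapsed since the last run. The function queue_is_due and its parameters are hypothetical names for illustration, not Aegir's actual code:

```python
import time

def queue_is_due(last_run, interval_seconds, now=None):
    """Return True when at least interval_seconds have passed since last_run."""
    now = time.time() if now is None else now
    return now - last_run >= interval_seconds

# Timestamp used in the log; it lies far enough in the past
# that any realistic interval has already elapsed.
STALE_LAST_RUN = 1235156009

# A 15-minute queue (the civi setting mentioned in the log) is due.
print(queue_is_due(STALE_LAST_RUN, 15 * 60))
```

Passing now explicitly makes the check easy to exercise without waiting on the real clock.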
[01:25:21]<bgm>and hmm, i guess my cron seems ok, and the core-cron mostly works
[01:25:35]<ergonlogic>well, those'll tell Aegir to go ahead with calling cron
[01:25:43]<ergonlogic>but doesn't civi have its own timeout?
[01:26:04]<bgm>jobs will have their own configs, yep
[01:26:19]<bgm>but i'm mostly checking the "civicrm cron run" (time) displayed in the front-end
[01:26:26]<ergonlogic>iirc, you can set that to run whenever it's called, no?
[01:26:32]<bgm>yep
[01:26:55]<ergonlogic>so... is it that civi's cron isn't running, or that the time is wrong?
[01:27:18]<ergonlogic>have you tried calling 'drush hosting-civicrm_cron' directly?
[01:27:48]<bgm>yep, that works
[01:28:30]<ergonlogic>ok... so...
[01:28:42]<ergonlogic>I'm not sure I understand where the problem is then
[01:29:11]<bgm>probably i've been staring at the problem too much
[01:29:17]<bgm>but it seems to me like the crons are called kind of randomly
[01:29:38]<bgm>i worked on this last night, and this morning, neither cron had been called (drupal or civicrm)
[01:29:51]<bgm>i ran hosting-dispatch manually, and at some point the civicrm cron ran
[01:30:08]<bgm>even though they're configured to run more regularly (drupal = 1h, civi = 15 mins)
[01:30:26]<ergonlogic>are you confirming the last crons on the drupal and civi directly?
[01:30:34]<bgm>now i've been starting hosting-tasks a dozen times today, and the drupal cron didn't seem to be running, but now it just did
[01:30:53]<ergonlogic>because the problem could be w/ registering the time it ran
[01:31:18]<bgm>hmm, i haven't been checking systematically
[01:31:25]<bgm>i guess that's highly probably
[01:31:27]<bgm>probable*
[01:32:08]* realityloopAFK is now known as realityloop
[01:32:21]<bgm>I guess a civi 'job' that runs only now and then, say, every hour, could be causing that problem
[01:34:05]<bgm>oh erm .. :-)
[01:34:08]<bgm>yeah.. probably
[01:34:21]<bgm>I guess it's triggering an exception, and it's not caught
[01:34:42]* DecipherL has joined #aegir
[01:35:24]<ergonlogic>if the cron is long-running, it's also possible that the drush call times out, or something
[01:36:21]<ergonlogic>I recently implemented the queue locking to avoid having a long-running task pass the timeout, and end up getting called again
[01:36:34]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[01:36:50]<ergonlogic>could it be that the lock isn't being released?
[01:37:26]<ergonlogic>I don't recall the default expiry I set...
[01:37:28]<bgm>ergonlogic: cool, I guess that rules out the code issues i was having earlier, and probably just a broken cron task. I'll check locking as well
[01:38:00]<ergonlogic>let me see if I can track down where that locking code is implemented
[01:38:22]<bgm>locks are in the `semaphore` table?
[01:38:44]* ivanjaros has joined #aegir
[01:42:55]<bgm>i'm still a bit suspicious about the crontab config
[01:43:06]<bgm>this is an old wheezy server, so it's harder to debug..
[01:43:30]<ergonlogic>It's around here: http://cgit.drupalcode.org/hosting/tree/dispatch.hosting.inc#n26
[01:43:48]<ergonlogic>yes, semaphore
[01:44:17]<ergonlogic>it's per queue
[01:44:29]<ergonlogic>$semaphore = "hosting_dispatch_{$queue}_running";
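The per-queue lock with an expiry described above can be sketched roughly as follows. This is an in-memory illustration only, assuming a lock that blocks a second run until it is released or its timeout passes. Only the semaphore name pattern comes from the log; the ExpiringLocks class and its methods are hypothetical and do not reflect how the Drupal semaphore table is implemented:

```python
import time

class ExpiringLocks:
    """Toy lock table: each named lock stays held until released or expired."""

    def __init__(self):
        self._expires = {}  # lock name -> expiry timestamp

    def acquire(self, name, timeout, now=None):
        """Take the lock unless someone holds an unexpired one."""
        now = time.time() if now is None else now
        expiry = self._expires.get(name)
        if expiry is not None and expiry > now:
            return False  # still held, skip this run
        self._expires[name] = now + timeout
        return True

    def release(self, name):
        """Drop the lock; releasing an unheld lock is a no-op."""
        self._expires.pop(name, None)

def semaphore_name(queue):
    # Name pattern quoted in the log.
    return f"hosting_dispatch_{queue}_running"
```

With a lock like this, a run that crashes without releasing only blocks the queue until the timeout passes, which is why the default expiry matters when a queue appears stuck.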
[01:45:02]<bgm>hmm, i think the aegir user is .. partially deleted
[01:45:11]<bgm>it's in passwd, but not shadow
[01:45:42]<bgm>every minute, i see "CRON[10796]: Authentication failure" in the syslog
[01:45:58]<bgm>otherwise cron would show the command being executed?
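A quick way to spot an account that is in passwd but not in shadow is to compare the user names in both files. A minimal sketch, assuming the usual colon-separated format with the user name in the first field. The sample contents below are made up for illustration (including the uid), and the function names are hypothetical:

```python
def usernames(text):
    """Collect the first colon-separated field of each non-empty line."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(line.split(":", 1)[0])
    return names

def users_missing_from_shadow(passwd_text, shadow_text):
    """Return users listed in passwd that have no shadow entry."""
    return sorted(usernames(passwd_text) - usernames(shadow_text))

# Made-up sample data, not taken from the server in the log.
passwd_sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "aegir:x:999:999::/var/aegir:/bin/bash\n"
)
shadow_sample = "root:*:16000:0:99999:7:::\n"

print(users_missing_from_shadow(passwd_sample, shadow_sample))
```

On a real system the two texts would come from /etc/passwd and /etc/shadow, and reading the latter normally requires root.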
[01:46:05]<ergonlogic>as for crontab...
[01:46:48]<ergonlogic>"*/1 * * * *" should be equivalent to "* * * * *"
[01:47:17]<ergonlogic>but apparently some OSes don't support the */n format
[01:47:26]<ergonlogic>wheezy should, but meh
[01:47:30]<ergonlogic>something to consider
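The equivalence of "*/1" and "*" in the minute field can be checked by expanding both into the minutes they match. A small sketch that handles only the "*" and "*/n" forms discussed above; expand_minute_field is a hypothetical helper, not part of any cron implementation:

```python
def expand_minute_field(field):
    """Expand a cron minute field of the form '*' or '*/n' into matching minutes."""
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        if step < 1:
            raise ValueError("step must be a positive integer")
        return list(range(0, 60, step))
    raise ValueError("only '*' and '*/n' are handled in this sketch")

# Both forms match every minute of the hour.
print(expand_minute_field("*/1") == expand_minute_field("*"))
```

A cron daemon that lacks step support would reject or misread "*/1", so "* * * * *" is the more portable spelling.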
[01:47:31]<bgm>yikes, that was it
[01:47:41]<bgm>(#@! sorry.. wild goose chase)
[01:47:52]<bgm>it must have been from when I removed the aegir2 debian package
[01:47:59]<ergonlogic>ah, ok
[01:48:43]<bgm>i had this on a few servers, and restored the aegir user afterwards.. well, the passwd and shadow, but must have forgotten to restore the shadow file in this case
[01:48:54]<ergonlogic>that's pretty messed up :)
[01:48:54]<bgm>(i know, not the best way to restore users, but i didn't want to mess with user IDs)
[01:49:11]<bgm>incidentally, this was my only server without logcheck running :)
[01:49:20]<ergonlogic>of course :)
[01:49:27]<bgm>ergonlogic: many thanks, i was going mad :-)
[01:49:42]<ergonlogic>np
[01:49:51]<ergonlogic>sometimes you just need a rubber duck
[01:49:59]<bgm>this is me: http://devopsreactions.tumblr.com/post/122324765782/testing-of-a-brand-n...
[01:50:03]<bgm>;-)
[01:50:17]<ergonlogic>https://en.wikipedia.org/wiki/Rubber_duck_debugging
[01:50:24]<bgm>ergonlogic: well, i did have a few bugs in the hosting_civicrm_cron code
[01:50:46]<bgm>yeah, very familiar with that :)
[01:51:32]<ergonlogic>probably from my half-assed implementation from way back
[01:51:58]<bgm>ah, nah, it was a d7 port error i did, with db_query()
[01:52:17]<ergonlogic>well, I'm glad I could "help" :p
[01:52:19]<bgm>drupal's db layer really is the extreme opposite of civicrm :)
[01:52:35]<ergonlogic>oh?
[01:52:37]<bgm>civi does a fatal error at the slightest suspicious sign.. drupal just lets it go
[01:53:18]<ergonlogic>yeah, I prefer to fail hard and early
[01:53:57]<bgm>yep, well.. drupal was initially for content, where any content is better than no content, but aegir/civi are transactional
[01:53:57]<ergonlogic>I ran into some of that leniency in a new D8 project I'm working on
[01:55:06]<ergonlogic>I'm working on a prototype to replace our queueing system, actually
[01:55:22]<bgm>cool :)
[01:55:44]<ergonlogic>the idea is to pop a task into a celery/rabbitmq queue, then have it run by a worker process
[01:55:52]<ergonlogic>async from drupal, obviously
[01:56:04]<ergonlogic>and then post output back via a little rest api endpoint
[01:56:50]<ergonlogic>if you try to write to the db with an incorrect data type, it'll throw an error in the log, but otherwise just ignore it
[01:57:06]<ergonlogic>which was a bit confusing at first
[01:57:37]<ergonlogic>for testing, I was setting a 418 response code
[01:57:50]<ergonlogic>b/c I like seeing a computer think it's a teapot :)
[01:58:11]<bgm>haha :)
[01:58:13]<ergonlogic>anyway, it ended up responding 200 if it failed
[01:58:38]<ergonlogic>just strange behaviour, imo
[01:59:19]<bgm>many rest apis will respond their actual response code in a result field, rather than the http code
[02:00:17]<bgm>(i get annoyed when apis respond 200 Ok, even when an error happened, but i guess that's partly why)
[02:00:27]<ergonlogic>I suppose
[02:00:46]<bgm>ergonlogic: hey, quick unrelated question: i noticed that aegir does not enable the 'update' module by default
[02:01:01]<bgm>but it's kind of a neat reminder to upgrade, but also to get hosting module stats
[02:01:27]<bgm>was it because install profiles have issues with drupal upgrades?
[02:01:27]<ergonlogic>iirc, it wasn't getting to the response object I was returning... so I guess Drupal was taking over and responding w/200 on my code's behalf, despite it failing
[02:01:31]<ergonlogic>blech
[02:01:41]<bgm>:)
[02:01:52]<ergonlogic>in aegir3?
[02:02:00]<bgm>yep
[02:02:03]<ergonlogic>I thought d7 enabled it by default, no?
[02:02:11]<ergonlogic>in the std profile?
[02:02:22]<bgm>hmm ok, maybe i should re-check
[02:02:23]<ergonlogic>I think we'd intended to enable it in aegir3
[02:02:28]<bgm>since that was a server that was upgraded
[02:02:31]<bgm>ok
[02:02:38]<ergonlogic>yeah, that might be it
[02:02:51]<bgm>i was checking the stats for https://www.drupal.org/project/hosting_civicrm which said 0 installs :)
[02:02:54]<ergonlogic>could you doublecheck that, and post an issue, if you see it isn't?
[02:02:55]* DecipherL has quit (Ping timeout: 244 seconds)
[02:02:58]<bgm>sure
[02:03:01]<bgm>thanks
[02:03:01]* ivanjaros has quit (Quit: https://drupal.org/user/135190)
[02:03:13]<ergonlogic>yeah, it'd definitely be good to get that feedback
[02:03:25]<bgm>i have to afk a while, but thanks again for the support :)
[02:03:51]<ergonlogic>also, related to updates, cameron recently posted a new hosting module to track update status on hosted sites via aegir
[02:04:04]<ergonlogic>haven't tried it yet
[02:04:15]* DecipheredAFK has joined #aegir
[02:04:22]<ergonlogic>but definitely valuable
[02:04:55]<bgm>cool
[02:05:03]<ergonlogic>anyway, got to get back to work. later :)
[02:09:54]* David_Hernandez has quit (Quit: :wq!)
[02:13:18]* DecipheredAFK has quit (Quit: ZNC - http://znc.in)
[02:13:51]* DecipheredAFK has joined #aegir
[02:18:25]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[02:19:54]* maestrojed has joined #aegir
[02:20:17]* DecipheredAFK has joined #aegir
[02:20:37]* realityloop is now known as realityloopAFK
[02:21:31]* maestrojed has quit (Client Quit)
[02:29:20]* DecipheredAFK has quit (Quit: ZNC - http://znc.in)
[02:30:18]* DecipheredAFK has joined #aegir
[02:39:20]* DecipheredAFK has quit (Quit: ZNC - http://znc.in)
[02:40:46]* DecipheredAFK has joined #aegir
[02:49:47]* DecipheredAFK has quit (Quit: ZNC - http://znc.in)
[02:50:53]* DecipheredAFK has joined #aegir
[02:58:32]* gandhiano_ has joined #aegir
[02:59:56]* mstenta has quit (Ping timeout: 272 seconds)
[03:03:07]* maestrojed has joined #aegir
[03:15:15]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[03:18:30]* DecipheredAFK has joined #aegir
[03:23:00]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[03:25:01]* DecipheredAFK has joined #aegir
[03:31:42]* gandhiano_ has quit (Ping timeout: 246 seconds)
[03:35:08]* gusaus has joined #aegir
[04:51:14]* ivanjaros has joined #aegir
[05:06:51]* DecipheredAFK has quit (Ping timeout: 244 seconds)
[05:10:03]* DecipheredAFK has joined #aegir
[05:14:38]* DecipheredAFK has quit (Ping timeout: 252 seconds)
[05:15:33]* DecipheredAFK has joined #aegir
[06:00:01]* mstenta has joined #aegir
[06:03:11]* realityloopAFK is now known as realityloop
[06:18:43]* mstenta has quit (Ping timeout: 255 seconds)
[06:20:10]* glennpra_ has joined #aegir
[06:24:10]* glennpratt has quit (Ping timeout: 256 seconds)
[06:36:20]* gandhiano_ has joined #aegir
[07:06:52]* glennpratt has joined #aegir
[07:10:43]* glennpra_ has quit (Ping timeout: 276 seconds)
[07:38:15]* Egyptian[Home] has joined #aegir
[08:01:49]* zombiebeard has quit (Quit: zombiebeard)
[08:01:54]* Egyptian[Home] has quit (Ping timeout: 246 seconds)
[08:07:21]* Yaazkal has joined #aegir
[08:15:13]* ivanjaros has quit (Quit: https://drupal.org/user/135190)
[08:23:57]* e-anima has quit (Quit: reallife not found)
[08:25:09]* fatguylaughing has quit (Quit: fatguylaughing)
[08:41:27]* gandhiano_ has quit (Ping timeout: 246 seconds)
[09:04:17]* drakythe is now known as zz_drakythe
[09:04:35]* Egyptian[Home] has joined #aegir
[09:04:48]* realityloop is now known as realityloopAFK
[09:21:23]* Egyptian[Home] has quit (Ping timeout: 246 seconds)
[09:28:36]* Egyptian[Home] has joined #aegir
[09:33:14]* Egyptian[Home] has quit (Ping timeout: 272 seconds)
[09:34:56]* fatguylaughing has joined #aegir