| [11:02:23] | * theMusician has joined #aegir |
| [11:14:59] | * realityloop has quit (Quit: Leaving..) |
| [11:15:16] | * realityloop has joined #aegir |
| [11:20:43] | * tommycox has joined #aegir |
| [11:32:40] | * realityloop has quit (Ping timeout: 240 seconds) |
| [11:37:37] | * realityloop has joined #aegir |
| [11:42:59] | * mengi has quit (Quit: Leaving.) |
| [11:52:59] | * theMusician has quit (Quit: theMusician) |
| [13:09:30] | * v20th has quit (Quit: Leaving) |
| [13:32:42] | * tommycox has quit (Ping timeout: 256 seconds) |
| [13:33:23] | * tommycox has joined #aegir |
| [13:34:21] | * tommycox_ has joined #aegir |
| [13:38:05] | * tommycox has quit (Ping timeout: 240 seconds) |
| [14:22:41] | * tommycox_ has quit (Ping timeout: 248 seconds) |
| [14:24:21] | * tommycox has joined #aegir |
| [16:38:25] | * realityloop has quit (Quit: Leaving..) |
| [17:12:22] | * hefring has quit (Ping timeout: 264 seconds) |
| [17:12:34] | * hefring has joined #aegir |
| [18:19:48] | * boshtian has joined #aegir |
| [18:22:24] | * tommycox has quit (Remote host closed the connection) |
| [18:34:57] | * reaper013 has joined #aegir |
| [18:36:38] | * tommycox has joined #aegir |
| [18:37:21] | * tommycox_ has joined #aegir |
| [18:41:09] | * tommycox has quit (Ping timeout: 240 seconds) |
| [19:19:20] | * reaper013 has quit (Ping timeout: 260 seconds) |
| [19:31:56] | * oluabbeys has joined #aegir |
| [20:02:19] | * fatguylaughing has joined #aegir |
| [20:07:10] | * fatguylaughing has quit (Ping timeout: 240 seconds) |
| [20:19:53] | * tommycox_ has quit (Remote host closed the connection) |
| [20:20:53] | * tommycox has joined #aegir |
| [20:48:35] | * tommycox has quit () |
| [20:57:56] | * cmcintosh has joined #aegir |
| [22:49:23] | * boshtian has quit (Quit: boshtian) |
| [23:47:13] | * reaper013 has joined #aegir |
| [23:58:10] | * oluabbeys has quit (Ping timeout: 240 seconds) |
| [00:20:43] | * v20th has joined #aegir |
| [00:20:44] | * oluabbeys has joined #aegir |
| [00:29:36] | * spyd has joined #aegir |
| [00:39:34] | * ybabel has joined #aegir |
| [01:36:51] | * spyd has quit (Quit: Lost terminal) |
| [01:39:45] | * cmcintosh has quit (Quit: Leaving...) |
| [01:40:26] | * spyd has joined #aegir |
| [01:46:08] | * fatguylaughing has joined #aegir |
| [01:49:38] | * fatguylaughing has quit (Client Quit) |
| [02:02:58] | * fatguylaughing has joined #aegir |
| [02:06:06] | * spyd has quit (Quit: leaving) |
| [02:12:56] | * spyd has joined #aegir |
| [03:07:26] | * spyd has quit (Quit: Lost terminal) |
| [03:13:09] | * theMusician has joined #aegir |
| [03:29:25] | * boshtian has joined #aegir |
| [03:30:10] | * oluabbeys has quit (Ping timeout: 240 seconds) |
| [03:33:36] | * boshtian has quit (Ping timeout: 248 seconds) |
| [03:38:25] | * reaper013 has quit (Quit: Page closed) |
| [03:51:49] | * roycroft has joined #aegir |
| [03:51:53] | <roycroft> | hello |
| [03:52:05] | <roycroft> | i have a site that's been compromised, and i need to shut it down asap |
| [03:52:22] | <roycroft> | the load is so high on that machine that i can't use gui to manage things |
| [03:52:34] | <roycroft> | does anyone know a way to disable a site via drush? |
| [04:01:34] | <ergonlogic> | roycroft: "drush provision-disable"? |
| [04:03:06] | <roycroft> | well that did not work out |
| [04:03:14] | <roycroft> | the load is too high and drush refuses to run |
| [04:03:23] | <roycroft> | can that be overridden? |
| [04:03:41] | <ergonlogic> | yes, hang on |
| [04:03:57] | <roycroft> | thanks |
| [04:05:16] | <ergonlogic> | try `drush @site-name provision-disable --strict=0 --critical_load_threshold=1000` |
| [04:06:41] | <roycroft> | nope |
| [04:06:45] | <roycroft> | still aborts |
| [04:06:49] | <roycroft> | i can kill apache though |
| [04:06:55] | <roycroft> | and wait for the load to go down |
| [04:07:05] | <ergonlogic> | try "--critical_load_multiplier" instead |
| [04:07:57] | <roycroft> | that's not an option |
| [04:08:01] | <roycroft> | this is an older version of drush |
| [04:08:12] | * mengi has joined #aegir |
| [04:08:25] | <ergonlogic> | you need "--strict=0" |
| [04:08:28] | <roycroft> | i have the load down now |
| [04:08:38] | <roycroft> | and drush is not aborting |
| [04:08:43] | <roycroft> | so maybe it will disable the site |
| [04:09:40] | <roycroft> | slowly it is doing the job |
| [04:09:58] | <roycroft> | yes, it is done |
| [04:10:00] | <roycroft> | thanks! |
| [04:10:07] | <roycroft> | and apache restarted upon completion |
| [04:10:23] | <roycroft> | and the load did not spike to 45 within seconds |
| [04:10:23] | <ergonlogic> | fwiw, we default to a load of 5x the number of CPUs, and fall back to load of 10, if we can't figure out how many CPUs there are |
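As a rough illustration of the default ergonlogic describes (5x the CPU count, falling back to 10 when the count is unknown), here is a minimal shell sketch. It is not Provision's actual code: the function name `critical_load_threshold` and its optional multiplier argument are hypothetical, and the example lines assume a Linux-style `nproc` and `/proc/loadavg`.

```shell
# Hypothetical helper: print the load threshold for a given CPU count.
# An empty, zero, or non-numeric count falls back to 10.
critical_load_threshold() {
  cpus="$1"
  multiplier="${2:-5}"
  case "$cpus" in
    ''|0|*[!0-9]*) echo 10 ;;
    *) echo $(( cpus * multiplier )) ;;
  esac
}

# Example: show the current 1-minute load average next to the threshold.
detected="$(nproc 2>/dev/null || true)"
threshold="$(critical_load_threshold "$detected")"
load="$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo 0)"
echo "cpus=${detected:-unknown} threshold=${threshold} load=${load}"
```

On a 4-CPU machine this would give a threshold of 20, which is why a load of 45 made drush refuse to run.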
| [04:10:41] | <roycroft> | yes, i should probably increase that |
| [04:10:47] | <roycroft> | on the newer machines |
| [04:10:56] | <roycroft> | this particular machine is our first aegir deployment |
| [04:10:56] | * ybabel has quit (Ping timeout: 252 seconds) |
| [04:11:02] | <roycroft> | and we have some older drupal sites on it |
| [04:11:19] | <roycroft> | my boss was supposed to migrate them to other machines a long time ago, but he can't be bothered to do that kind of stuff |
| [04:11:45] | <roycroft> | this particular website is one of his pet projects, so perhaps my disabling it will motivate him to get going on the migrations |
| [04:12:23] | <roycroft> | i pointed dns for the domain to localhost as well, so that we'll stop taking the hits when $TTL expires |
| [04:12:42] | <roycroft> | but yeah load is down to 2.1 |
| [04:12:56] | <ergonlogic> | so, to increase that threshold, you should be able to set `critical_load_multiplier` in /var/aegir/.drush/drushrc.php |
| [04:13:20] | <roycroft> | and as i said, it was spiking up to 45 within seconds of restarting apache, and would climb into triple digits shortly thereafter |
| [04:13:25] | <ergonlogic> | hmm, actually, that'll be overwritten |
| [04:16:43] | <roycroft> | i'll look into how to set the load threshold higher once i come up for air, unless you think of how to do it before then |
| [04:16:48] | <roycroft> | thanks for helping me get it shut down |
| [04:17:04] | <ergonlogic> | sorry, it'd go in `/var/aegir/.drush/local.drushrc.php` |
| [04:17:07] | <roycroft> | i don't care about that particular site - while it's a pet project of my boss, it's generally very low traffic |
| [04:17:16] | <roycroft> | but it's affecting some high traffic sites |
| [04:17:24] | <roycroft> | i think i'm going to spin this site off on its own vm |
| [04:17:31] | <ergonlogic> | understood |
| [04:18:23] | <roycroft> | that file does not exist |
| [04:18:31] | <ergonlogic> | so, it should just be `$options['critical_load_multiplier'] = 10;` in /var/aegir/.drush/local.drushrc.php |
| [04:18:34] | <roycroft> | is that something that gets applied to every site? |
| [04:18:38] | <ergonlogic> | no, not by default |
| [04:18:52] | <roycroft> | i seem to remember having to create that before for something |
| [04:18:58] | <roycroft> | btw, i'm a network admin/sysadmin |
| [04:18:59] | <ergonlogic> | but you should see an include for it at the very bottom of /var/aegir/.drush/drushrc.php |
| [04:19:03] | <roycroft> | not a content developer |
| [04:19:08] | <roycroft> | and i know almost nothing about aegir/drush |
| [04:19:18] | <roycroft> | but i'm the one who has to clean up the messes :) |
| [04:19:27] | <ergonlogic> | I feel your pain :) |
| [04:20:07] | <roycroft> | so when i create that file, do i have to regenerate the sites for it to be applied? |
| [04:20:11] | <ergonlogic> | so, that file isn't managed by Aegir, whereas the one that includes it gets re-written whenever the Aegir site itself is verified |
| [04:20:19] | <roycroft> | or does drush look for it every time i try to invoke it? |
| [04:20:21] | <ergonlogic> | nope |
| [04:20:24] | <roycroft> | ok |
| [04:20:37] | <ergonlogic> | it'll be included in any drush command run by the 'aegir' user |
| [04:20:58] | <ergonlogic> | it shouldn't affect the hosted sites though |
| [04:21:54] | <ergonlogic> | on a site-by-site basis, you could create `local.settings.php` files, if you wanted to inject variables or other config at the Drupal level |
| [04:22:40] | <ergonlogic> | and there are a number of other methods for altering behaviour: http://docs.aegirproject.org/en/3.x/extend/altering-behaviours/ |
| [04:22:51] | <roycroft> | ok |
| [04:23:09] | <roycroft> | i created the local.drushrc.php file as you suggested for now on that machine |
| [04:23:32] | <roycroft> | and i can do a drush vget, which runs correctly |
| [04:23:42] | <roycroft> | i wanted to make sure i didn't have some syntax error that would cause drush to barf |
| [04:24:03] | <ergonlogic> | prudent |
| [04:24:08] | <roycroft> | so i should be able to run drush at higher loads now |
| [04:24:25] | <roycroft> | my next step is to get my boss interested in how the site got exploited :) |
| [04:25:34] | <ergonlogic> | so, fyi, we set that threshold generally, so that serving sites doesn't get bogged down because someone triggered a backup, or the like |
| [04:26:21] | <colan> | as we've got some projects on there: https://about.gitlab.com/2017/02/01/gitlab-dot-com-database-incident/ |
| [04:27:19] | <roycroft> | yes, that's a great idea |
| [04:27:21] | <ergonlogic> | we also run drush commands that are triggered from the backend (via the queue runner) both niced and ioniced, to deprioritize them vs. web traffic |
| [04:27:43] | <ergonlogic> | but manually run drush commands won't automatically inherit that |
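To get similar deprioritization for a manually run command, one option is to wrap it in `nice` and `ionice` by hand. The sketch below is an assumption-laden illustration, not what Aegir's queue runner does: the wrapper name `low_priority` is hypothetical, and the priority values (niceness 19, idle I/O class) are chosen for the example because the discussion does not state which levels Aegir uses.

```shell
# Hypothetical wrapper: run a command at the lowest CPU priority and,
# where ionice is available and permitted, in the idle I/O class.
low_priority() {
  if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
    nice -n 19 ionice -c 3 "$@"
  else
    # ionice is Linux-specific; fall back to CPU niceness only.
    nice -n 19 "$@"
  fi
}

# Example (the site alias is a placeholder, and provision-backup is
# assumed to be the backup command on this install):
# low_priority drush @site-name provision-backup
```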
| [04:27:43] | <roycroft> | whatever the default threshold is, we have rarely surpassed it in the past |
| [04:27:52] | <roycroft> | but occasionally i think we did |
| [04:28:03] | <roycroft> | that setting should ameliorate the effects of load spikes |
| [04:28:26] | <ergonlogic> | yep, just wanted to make sure you were aware of the possible consequences |
| [04:28:33] | <roycroft> | i do appreciate it |
| [04:28:53] | <roycroft> | can you tell me now how to get the boss' kid to stop doing backups of huge websites into /tmp and filling it up? :) |
| [04:29:07] | <roycroft> | i'm thinking rmuser |
| [04:29:12] | <roycroft> | but that would not go over well |
| [04:29:40] | <roycroft> | i created a 10GB /tmp a while ago |
| [04:29:46] | <roycroft> | but he still fills it up |
| [04:30:06] | <colan> | roycroft: you can have that cleared more often, not just at reboots. might help. |
| [04:30:22] | <roycroft> | yes, that's what i'll probably do |
| [04:30:30] | <ergonlogic> | colan: nice transparency on gitlab.com's part. But a shame they lost the data. Did it hurt any of our projects? |
| [04:30:34] | <roycroft> | i thought a cron job to clean /tmp every 12 hours would help |
| [04:30:46] | <roycroft> | but he does some really huge backups at times |
| [04:30:54] | <roycroft> | he just needs to stick them somewhere else |
| [04:30:58] | <colan> | right, or daily, etc. would be good. |
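The periodic cleanup being discussed could look something like the following. This is only a sketch: the helper name `clean_old_files`, the script path in the crontab comment, and the 12-hour age cutoff are assumptions for illustration, and `-mmin` and `-delete` are extensions found in GNU and BSD `find` rather than POSIX.

```shell
# Hypothetical helper: delete regular files under a directory whose
# modification time is more than the given number of minutes ago.
clean_old_files() {
  dir="$1"
  minutes="$2"
  find "$dir" -type f -mmin +"$minutes" -delete
}

# Example call: remove files in /tmp older than 12 hours.
# clean_old_files /tmp 720

# Example crontab entry (script path is a placeholder), run every 12 hours:
# 0 */12 * * * /usr/local/sbin/clean-tmp.sh
```

An age-based cutoff avoids deleting a backup that is still being written, although it does nothing to stop a single huge backup from filling the filesystem in the meantime.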
| [04:31:08] | <roycroft> | or maybe i need to install a 1TB disk for /tmp |
| [04:31:16] | <roycroft> | that is a very ugly solution though |
| [04:31:39] | <roycroft> | oh well |
| [04:31:50] | <ergonlogic> | I've set up some Aegir servers on AWS, where we have backups shipped off to S3 |
| [04:31:55] | <colan> | ergonlogic: Of the stuff i'm subscribed to (which I think is everything), I didn't get any notifications during that window so we should be good. |
| [04:32:02] | <ergonlogic> | and thus removed from local storage |
| [04:32:07] | <roycroft> | it's not as bad to fill up a /tmp filesystem as to fill up / when /tmp is part of / |
| [04:32:17] | <colan> | heh, yeah. |
| [04:32:39] | <roycroft> | but this is not really a technical problem - it's a social problem |
| [04:33:53] | <ergonlogic> | ouch, part of the gitlab.com failure was "schrodinger's backup" problem |
| [04:34:16] | <ergonlogic> | they appeared to have automated backups that were mostly empty |
| [04:34:30] | <roycroft> | i am weird |
| [04:34:42] | <roycroft> | i do test restores from my backups on a regular basis, just to make sure they work |
| [04:35:16] | <ergonlogic> | that's a good idea. if you don't test your backups, eventually you'll get bitten by it |
| [04:35:24] | <roycroft> | i am pretty old school |
| [04:35:39] | <ergonlogic> | in Aegir, we use backups every time we clone or migrate a site |
| [04:35:40] | <roycroft> | i started using/admining unix systems in the late '70s |
| [04:35:51] | <ergonlogic> | so we test them in the normal course of operations |
| [04:36:09] | <roycroft> | even though 9 track tapes were pretty reliable, it was important to test them |
| [04:36:30] | <ergonlogic> | wow, well you have me beat by several decades :) |
| [04:36:31] | <roycroft> | they were a hell of a lot more reliable than dat/exabyte |
| [04:37:09] | <roycroft> | those were the days |
| [04:37:13] | <roycroft> | pre-spam |
| [04:37:36] | <ergonlogic> | again, from gitlab.com's post-mortem: "So in other words, out of five backup/replication techniques deployed none are working reliably or set up in the first place. " |
| [04:37:54] | <roycroft> | i use amanda for backups |
| [04:38:11] | <roycroft> | and although i manually do the test restores, i could script them i'm sure |
| [04:38:36] | <roycroft> | but it's not a big deal to take 20 minutes once every couple weeks and restore something just to be sure it works |
| [04:38:44] | <roycroft> | the daily backup reports are also good indicators |
| [04:38:53] | <roycroft> | which i actually read |
| [04:39:19] | <roycroft> | i also have nagios do queries on each database on a regular basis |
| [04:39:47] | <roycroft> | i've seen folks back up databases that get corrupt |
| [04:40:04] | <ergonlogic> | one of the challenges with how Aegir relies on backups is that we have longer downtime for upgrades (migrations) than we really need. |
| [04:40:05] | <roycroft> | and then run through their whole tape cycle without being aware that they're backing up corrupt databases |
| [04:40:51] | <roycroft> | aegir developers might not like this, but i discourage using aegir for backing up the drupal sites |
| [04:40:55] | <roycroft> | for upgrades it needs to be done |
| [04:41:02] | <roycroft> | but i prefer to do my own filesystem backups |
| [04:41:07] | <roycroft> | for the day-to-day stuff |
| [04:41:22] | <ergonlogic> | we had a bug that was intermittently blocking clones at one point, and it turned out to be corrupt db dumps |
| [04:41:32] | <roycroft> | that's not good |
| [04:41:38] | <ergonlogic> | since fixed, and throws a proper error now, if such a thing recurs |
| [04:42:01] | <ergonlogic> | we had been piping the results of the dump into gzip, which was stifling the errors |
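The failure mode here is general to shell pipelines: without extra options, a pipeline's exit status is that of its last command, so a failed dump piped into `gzip` still reports success. The sketch below shows the problem and one portable way around it (dump to a temporary file, check the status, then compress). It is not the fix that went into Aegir, which the discussion does not describe; `dump_and_compress` is a hypothetical name.

```shell
# The problem: the first command fails, but the pipeline's exit status
# comes from gzip, which succeeds.
sh -c 'exit 3' | gzip -c > /dev/null
echo "pipeline status: $?"

# Hypothetical wrapper: run a dump command and compress its output only
# if the command itself succeeded. Usage: dump_and_compress OUT CMD [ARGS...]
dump_and_compress() {
  out="$1"
  shift
  tmp="$(mktemp)" || return 1
  if "$@" > "$tmp"; then
    gzip -c "$tmp" > "$out"
    status=$?
  else
    status=$?
    echo "dump command failed with status $status" >&2
  fi
  rm -f "$tmp"
  return "$status"
}

# Example (database name and output path are placeholders):
# dump_and_compress /backups/site.sql.gz mysqldump site_db
```

In shells that support it, `set -o pipefail` is a shorter alternative that keeps the streaming pipe, at the cost of portability.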
| [04:42:03] | <roycroft> | i put my databases on a filesystem that does not get backed up |
| [04:42:12] | <roycroft> | but i do db dumps every night to a filesystem that does get backed up |
| [04:42:32] | <ergonlogic> | well, aegir backups are good, in that you get a snapshot of a site's entire state |
| [04:42:38] | <roycroft> | backing up live databases doesn't work out well if one is trying to do filesystem dumps |
| [04:43:02] | <roycroft> | yes, i'm not suggesting that aegir backups are problematic per se |
| [04:43:17] | <roycroft> | honestly, the main problem i have with them is that i don't control them |
| [04:43:26] | <ergonlogic> | but separate FS and db backups are worthwhile anyway, I agree |
| [04:43:28] | <roycroft> | my boss and his kid, who does most of the website development, do |
| [04:43:37] | <roycroft> | and they don't do anything consistently |
| [04:43:47] | <roycroft> | if i left it up to them i'd have no idea what is being backed up and what isn't |
| [04:43:59] | <ergonlogic> | there's a feature in Aegir to schedule nightly site backups, btw |
| [04:44:04] | <roycroft> | so the problem is not with aegir |
| [04:44:09] | <roycroft> | it's a social problem again |
| [04:44:17] | <roycroft> | yes, i know there is |
| [04:44:29] | <ergonlogic> | yeah, I hear you |
| [04:44:41] | <roycroft> | but that can be disabled without my knowledge |
| [04:45:10] | <roycroft> | another problem i deal with is that i back up offsite |
| [04:45:22] | <roycroft> | and backups are taking about 8.5 hours on average now |
| [04:45:30] | <roycroft> | i'd like to keep them to 6 hours |
| [04:45:38] | <roycroft> | midnight-6am local time |
| [04:45:54] | <roycroft> | when my boss or his kid start doing aegir backups and i'm doing fs and db backups |
| [04:46:00] | <roycroft> | it duplicates a lot of the data |
| [04:46:07] | <roycroft> | and makes the backups take longer |
| [04:46:10] | <ergonlogic> | from Douglas Adams: "... a common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools." |
| [04:46:48] | <roycroft> | that's similar to my first rule of system administration: "never overestimate the intelligence of the end user" |
| [04:47:44] | <roycroft> | ok, off to other fires |
| [04:47:49] | <roycroft> | thank you again! |
| [04:47:57] | <ergonlogic> | np. any time :) |
| [04:48:01] | <roycroft> | it's refreshing to get good help without attitude |
| [04:48:05] | <roycroft> | unusual for the irc :) |
| [04:48:18] | * roycroft has left #aegir () |
| [05:09:57] | * spyd has joined #aegir |
| [05:54:31] | * oluabbeys has joined #aegir |
| [05:55:33] | * oluabbeys has quit (Client Quit) |
| [06:01:44] | * gusaus has joined #aegir |
| [06:04:21] | <helmo> | Hi all, ergonlog1c bgm jonpugh colan cweagans gboudrias memtkmcc .. It's Scrum time. |
| [06:05:14] | <helmo> | ergonlogic: Any idea why in our objects $application_name is protected ? https://www.drupal.org/node/2812853 |
| [06:05:16] | <hefring> | https://www.drupal.org/node/2812853 => whitelist IP's for basic auth [#2812853] => 9 comments, 1 IRC mention |
| [06:05:54] | <helmo> | I'd like to add a quick condition to the UI of this feature as it's apache only at the moment. |
| [06:09:06] | <helmo> | About my week, I deployed hosting_https to a live master slave server ... after some tuning :) ... So I think we're nearing beta https://gitlab.com/aegir/hosting_https/issues/29 |
| [06:10:23] | <helmo> | An important thing to keep track of is Zeus ... Zeus is going down (in April)... Where do we move our package repo / jenkins ? |
| [06:11:17] | <ergonlogic> | I'd suggest we consider the Suse packaging service again |
| [06:11:26] | <helmo> | We have https://www.drupal.org/node/2817199 open but no traction ... the 'easy/lazy' way would be to just move it to a new vm |
| [06:11:27] | <hefring> | https://www.drupal.org/node/2817199 => Alternative hosting of the Debian package archive [#2817199] => 3 comments, 2 IRC mentions |
| [06:14:02] | <ergonlogic> | helmo, can you point me to where $application_name is set that way? |
| [06:14:50] | <helmo> | ergonlogic: in provision http/Provision/Service/http/apache.php |
| [06:15:19] | <ergonlogic> | right, just found it |
| [06:17:33] | <colan> | there's also the do-everything-in-gitlab solution. might be easier than dealing with suse? |
| [06:18:01] | <colan> | unless they delete their production DB again. ;) |
| [06:18:36] | <ergonlogic> | gitlab offers a package repo now? |
| [06:19:39] | <helmo> | not that I've seen, mostly docker building packages |
| [06:20:04] | <helmo> | and we already have that .... as an intermediate step I could build the packages locally ... and upload them to a webserver |
| [06:20:16] | <colan> | haven't looked at the details, but what i posted in the ticket: https://about.gitlab.com/2016/10/12/automated-debian-package-build-with-... |
| [06:20:20] | <ergonlogic> | we could build the packages in gitlab-ci/travis, and upload them to the Suse repos |
| [06:20:24] | <helmo> | then we just need reprepro on a new server |
| [06:20:31] | <ergonlogic> | Suse doesn't actually handle the build, iirc |
| [06:20:37] | <colan> | ah |
| [06:21:02] | <helmo> | ergonlogic: it's the open build service ... |
| [06:21:10] | <ergonlogic> | right |
| [06:21:21] | <helmo> | but you can upload ready made packages |
| [06:26:16] | <ergonlogic> | fyi, here's the result of some previous experimentation: https://build.opensuse.org/project/show/home:AegirProject |
| [06:30:21] | <ergonlogic> | helmo: re. application_name, it's mostly used to generate various filenames for the specific service. I guess the reason it's protected is to make it overrideable from child classes... |
| [06:32:37] | * boshtian has joined #aegir |
| [06:32:37] | * boshtian has quit (Client Quit) |
| [06:32:49] | <ergonlogic> | helmo: do you need to access it externally? |
| [06:33:27] | <helmo> | would we have to add a getter function? or can we do it another way? I need to check from a hook_form_alter |
| [06:33:47] | <ergonlogic> | getter function is probably best |
| [06:37:38] | <helmo> | Hmm I think I found an existing way in config_data() ... drush php-eval "print_r(d('@server_master')->service('http')->config_data()['application_name']);" |
| [06:37:41] | <ergonlogic> | helmo: can't you look in `$web_server->services['http']->type`? |
| [06:38:48] | <ergonlogic> | sorry, that'd be on the front-end |
| [06:38:53] | <helmo> | ergonlogic: then I would have to account for some variations ... https_apache vs apache |
| [06:43:13] | <helmo> | about the deb packages, I'll dig through the docker code we use now for the dev packages ... and will try to simulate a stable release. |
| [06:43:51] | <helmo> | The 3.10 release (next month) will be a good target to try and reach |
| [06:45:52] | <helmo> | The other major thing we have on Zeus is Jenkins ... jonpugh can you find some time to get Travis 'over the finish line'? It feels like it's almost there to surpass Jenkins |
| [06:47:13] | <helmo> | Travis is currently failing on a sudo issue ...https://travis-ci.org/aegir-project/provision |
| [06:48:36] | <ergonlogic> | helmo: re. ip whitelisting, it seems like `if (in_array($type, ['apache', 'https_apache', 'apache_ssl'])) ...` is more straightforward than trying to get $application_name from the backend |
| [06:51:24] | <helmo> | ergonlogic: hmmm, yes that isn't as bad as I imagined |
| [06:51:59] | * mengi has quit (Ping timeout: 256 seconds) |
| [06:52:36] | * mengi has joined #aegir |
| [07:01:05] | <helmo> | Any Coop stuff? #aegir-coop |
| [07:45:25] | * theMusician has quit (Quit: theMusician) |
| [08:25:52] | * ybabel has joined #aegir |
| [08:26:17] | * ybabel has quit (Client Quit) |
| [09:01:31] | * spyd has quit (Remote host closed the connection) |
| [09:03:47] | * fatguylaughing has quit (Quit: fatguylaughing) |
| [09:14:13] | * spyd has joined #aegir |
| [09:59:17] | * theMusician has joined #aegir |
| [10:13:32] | * v20th has quit (Quit: Leaving) |