[09:18:36] Blocked-on-schema-change, DBA, Wikidata: Schema change on production for increase the size of wbt_text_in_lang.wbxl_language - https://phabricator.wikimedia.org/T237120 (jcrespo) p:Triage→Normal Thanks, acking this, but I hope this is not an emergency, as we may take a bit more than usual to...
[09:34:25] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (jcrespo) > We want to move forward with validating the index approach as our main priority Yes, I think I will be able to do this this w...
[09:34:47] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (jcrespo) a:jcrespo
[09:57:35] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) Status at the moment: ` == jobs_with_all_failures (6) == an-master1002.eqiad.wmnet-Monthly-1st-Mon-production-hadoop-namen...
[10:34:10] DBA, TechCom-RFC: MediaWiki database policy and/or guidelines (2019) - https://phabricator.wikimedia.org/T220056 (jcrespo) I actually had a change of heart. This was already approved in 2015 with a large and open for suggestions workflow- RFC. If someone has concrete things to disagree with, all policy s...
[10:55:57] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) * `an-master1002.eqiad.wmnet-Monthly-1st-Mon-production-hadoop-namenode-backup`: connectivity issue bacula client: ` Nov 04 0...
[11:06:35] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) @elukey @Ottomata Re: matomo1001, is there a reason not to have daily incrementals? If the reason is that it generates a full...
[11:07:33] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo)
[11:07:55] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo)
[11:42:02] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (elukey) >>! In T236406#5630677, @jcrespo wrote: > @elukey @Ottomata Re: matomo1001, is there a reason not to have daily incrementals?...
[11:47:25] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) @elukey If this helps, I can try generating manually an incremental, for a better informed decision about storage size (it sh...
[11:55:25] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (elukey) I am all for simplifying and standardizing confs, so no opposition about incremental. Only one question - what would it change...
[11:56:41] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) ` # check_bacula.py matomo1001.eqiad.wmnet-Weekly-Wed-production-mysql-srv-backups 2019-10-30 02:05:43: type: F, status: T, b...
[11:58:49] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) >>! In T236406#5630885, @elukey wrote: > I am all for simplifying and standardizing confs, so no opposition about incremental...
[12:00:11] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) Let me find where they are configured and I will send you a patch- later feel free to ping me on IRC and I will show you how...
[12:19:39] Blocked-on-schema-change, DBA, Wikidata: Schema change on production for increase the size of wbt_text_in_lang.wbxl_language - https://phabricator.wikimedia.org/T237120 (Ladsgroup) Thanks. We got lucky and it's not a big blocker for us right now.
[12:20:17] Blocked-on-schema-change, DBA, Wikidata: Schema change on production for increase the size of wbt_text_in_lang.wbxl_language - https://phabricator.wikimedia.org/T237120 (jcrespo) a:Marostegui
[12:28:30] akosiaris: now I understand the comments on backup schedules
[12:28:43] they are a bit arbitrary now that I see them-
[12:28:55] weekly backups only do full weekly backups
[12:29:05] monthly backups do daily incrementals
[12:29:27] hourly backups do weekly fulls, aside from hourly incrementals
[12:29:56] yeah, not always well named.
IIRC the term denotes when the full happens
[12:30:05] actually, that is ok
[12:30:23] it is that I would expect equivalent incremental policy
[12:30:28] but as long as you are ok with a new full happening, they can be changed
[12:30:40] I think in the long term I will just parse the schedules
[12:30:51] but for now I will patch my hardcoded check
[12:31:19] for which I didn't really understand your original comments
[12:31:26] and made the wrong assumptions
[12:33:08] keep in mind that the schedules part is the most flexible configuration term that exists in bacula. And hence a real pain to do in puppet
[12:33:20] oh, not complaining about puppet
[12:33:43] the puppet implementation is actually really nice
[12:33:51] (your implementation)
[12:34:05] it is the names that get me confused
[12:34:12] heh, I do see many issues with it tbh. I assume you will too over time
[12:34:18] but thanks anyway
[12:34:29] and in fact, it may show some lack of coverage of database backups
[12:35:01] or misunderstanding
[12:35:23] a Monthly backup will have daily granularity
[12:35:38] but a weekly backup will only have weekly granularity
[12:35:51] do you see my confusion?
[12:37:00] once you understand it, it is ok, they are just different schedules with full backups with the given name
[12:39:23] actually, that is wrong- hourly only do fulls weekly, so you do not follow your own rules, akosiaris :-P
[12:39:45] but that is ok, just it is confusing at first
[12:41:34] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) @elukey I am sorry, after looking closely to the policies, I mistakenly assumed the schedule was wrong. I will abandon patchi...
[12:42:25] probably some compromises were done due to storage constraints that maybe now can be lifted
[12:53:02] jynus: ah yeah, I see what you mean, hourly has the same stanza as weekly
[12:53:11] runs => [
[12:53:11] { 'level' => 'Full',
[12:53:11] 'at' => "${name} at 02:05",
[12:53:11] },
[12:53:22] ah, I know why, the ${name} there is the day
[12:53:29] see my immediate patch
[12:53:48] the granularity of backup::schedule has issues with going below that level
[12:53:59] unless you think the policy itself was not intended
[12:54:14] I summarize the current state here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/548244
[12:55:17] I have not checked whether Monthly creates daily incrementals, can check on live past jobs
[12:59:13] yes, it tries to do daily incrementals, which for databases may be too much (but has no real effect as dumps happen weekly)
[12:59:19] * Hourly: Full weekly, incremental hourly +1 day, +3 hours of buffer * Weekly: Only fulls, weekly + 1 day of buffer * Monthly: Fulls monthly, diffs every other fortnight, incr. daily +1 day, +1 day of buffer
[12:59:23] yeah that sums it up ^
[12:59:32] really badly named after all
[12:59:33] if ok, +1 and I deploy
[12:59:39] yeah :-D
[12:59:55] especially because you kept insisting "it is the full policy"
[12:59:57] :-P
[13:00:38] I don't have a better proposal at the time, so I just want to deploy that and I may tweak the documentation
[13:01:20] :(
[13:01:29] why sad?
[13:01:29] my naming is to blame indeed
[13:01:34] nooooo
[13:01:55] your effort beyond duty is why we still have backups
[13:02:12] and I am just poking on everything you have done, so I have to apologize
[13:02:58] well, supposedly it was within duty
[13:03:39] Imagine yourself trying to understand the 4 mariadb/mysql roles on production I inherited, and barely maintained!
[13:04:11] I wasn't able to even finish all refactoring I had to do
[13:05:44] I don't have to imagine.
I tried
[13:05:46] I gave up
[13:05:51] ha ha
[13:07:14] but then, someone comes and says, puppet mariadb is horrible, and I cannot but agree!
[13:11:53] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (Volans) One thing to take into account: we're using certificates signed by the Puppet CA in many places: - the puppet client certificate exposed via puppet code, se...
[13:15:42] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (jcrespo) > - some services don't have a way to reload them and need restart. Notably MySQL has added this capability only in version 8 and Mariadb in version 10.4...
[13:20:25] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (jbond) >>! In T236277#5631150, @jbond wrote: > We also need to consider `/usr/local/share/ca-certificates/Puppet_Internal_CA.crt` which is linked to `/etc/ssl/certs...
[13:25:55] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (jcrespo) > The major pain point here is likely MySQL I am relatively sure that I didn't enable strict cert checking because I knew this day would arrive (requires...
[13:26:34] DBA, Data-Services, Operations, cloud-services-team (Kanban): Prepare and check storage layer for mnwwiki - https://phabricator.wikimedia.org/T235743 (Urbanecm) >>! In T235743#5583420, @Marostegui wrote: > Let us know when the database is created so we can sanitize its tables and hand over to WMC...
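The naming confusion discussed above (a "Monthly" schedule ends up with finer granularity than a "Weekly" one) can be sketched in a few lines of Python. This is only an illustration built from the summary posted at 12:59:19, not the real `backup::schedule` Puppet code or `check_bacula.py`; the names `SCHEDULES` and `granularity` are hypothetical, "monthly" is approximated as 30 days, and the Monthly differentials are left out because their interval is not stated precisely in the chat.

```python
from datetime import timedelta

# Hypothetical encoding of the schedule summary from the chat.
# Each schedule name maps backup levels to how often that level runs.
SCHEDULES = {
    "Hourly": {"full": timedelta(weeks=1), "incremental": timedelta(hours=1)},
    "Weekly": {"full": timedelta(weeks=1)},
    # "monthly" approximated as 30 days; differentials intentionally omitted.
    "Monthly": {"full": timedelta(days=30), "incremental": timedelta(days=1)},
}


def granularity(schedule_name):
    """Return the shortest interval between any two backups of a schedule."""
    return min(SCHEDULES[schedule_name].values())


if __name__ == "__main__":
    for name in SCHEDULES:
        print(name, "full every", SCHEDULES[name]["full"],
              "- granularity", granularity(name))
```

Under these assumptions the schedule name only tells you how often the full runs (and not even that for Hourly), while the granularity is set by the incrementals, which is the mismatch jynus points out.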
[13:29:02] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (akosiaris) **IMPORTANT**: The puppet CA cert (and correspondingly key), is used as a "master" (a failsafe in case the actual host key is not around) key for bacula...
[13:31:19] DBA, Data-Services, Operations, cloud-services-team (Kanban): Prepare and check storage layer for mnwwiki - https://phabricator.wikimedia.org/T235743 (jcrespo) a:jcrespo Thanks, will apply the filtering and then assign to cloud for exposing it on wikireplicas.
[13:35:56] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (WDoranWMF) @jcrespo ok thanks for the update. If we were to take an approach where we used the limited version of the query, along with...
[13:40:06] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) The matomo false alert is now correctly gone, only the 6 issues due to the 3 tickets above (T236406#5630631) left: ` All fail...
[13:41:15] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (elukey) Nice thanks! Just pushed the new rules to the routers, so in theory an-master1002 and analytics1029 should go away now! Let me...
[13:41:24] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (jbond) >>! In T236277#5631271, @akosiaris wrote: > **IMPORTANT**: The puppet CA cert (and correspondingly key), is used as a "master" (a failsafe in case the actual...
[13:43:48] https://wikitech.wikimedia.org/w/index.php?title=Bacula&type=revision&diff=1843692&oldid=1841816
[13:43:51] DBA, Operations, serviceops, Goal, Patch-For-Review: Strengthen backup infrastructure and support - https://phabricator.wikimedia.org/T229209 (akosiaris)
[13:47:00] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (akosiaris) >>! In T236277#5631321, @jbond wrote: >>>! In T236277#5631271, @akosiaris wrote: >> **IMPORTANT**: The puppet CA cert (and correspondingly key), is used...
[13:50:15] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) Forcing a manual run on the 2 above for validation.
[14:23:10] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (CDanis) In terms of identifying services that use keys issued by the puppet CA -- is it wrong to think that the following would be a complete list? * keys created u...
[14:28:19] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (jcrespo) Option 1 would work, although I understand it is a big limitation. A variant way would be to run the query with a limit, but I a...
[14:31:46] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) :-) ` All failures: 4 (bromine, ...), Fresh: 90 jobs ` Unsubbing elukey and Otto to prevent unwanted spam (feel free to resub...
[14:36:12] DBA, Operations, Puppet, User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (jbond)
[14:36:29] DBA, Operations, Puppet, User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (jbond) p:Triage→Normal
[14:37:07] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (BPirkle) Regarding rate limiting, @Anomie pointed me at [[ https://gerrit.wikimedia.org/g/mediawiki/core/+/b19bbb5e5f2f60159a1781ccfd42b8...
[14:42:45] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (jbond) >>! In T236277#5631496, @CDanis wrote: > In terms of identifying services that use keys issued by the puppet CA -- is it wrong to think that the following wo...
[14:44:01] DBA, Operations, Puppet, User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (jbond) base::expose_puppet_certs - This can be ignored as it relates to the client key pairs and not the CA public certificate which is the file which is changing
[15:02:08] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (eprodromou) >>! In T235572#5631512, @jcrespo wrote: > if that is on server side, a "DOS as a service code endpoint"- hit here 60 times t...
[15:10:59] Blocked-on-schema-change, DBA, Product-Infrastructure-Team-Backlog, Wikipedia-Android-App-Backlog (Android-app-release-v2.7.29x-N-Nanaimo-Bar): Schema change for T234955 - add column wetc_revert_count to wikimedia_editor_tasks_counts - https://phabricator.wikimedia.org/T237264 (Mholloway)
[15:14:15] DBA, Operations, Patch-For-Review, Puppet, User-jbond: Extend Puppet CA Expiry date - https://phabricator.wikimedia.org/T236277 (Volans) @CDanis the problem is that all of those identify clients, while for the CA validation we're mostly interested in the server side. So while that surely woul...
[15:25:03] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (BPirkle) >>! In T235572#5631682, @eprodromou wrote: > > This is server-side code, just to keep this particular endpoint from bringing do...
[15:49:22] How long does a full restore from dump of the English encyclopedia take?
[15:50:18] I hope that is a theoretical question or not for wikipedia production :-P
[15:50:45] xmldumps you mean, or backups?
[15:58:17] DBA, Operations, Puppet, User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (jbond)
[16:04:23] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (eprodromou) >>! In T235572#5631866, @BPirkle wrote: > > # We will not enable the REST API on production until the index is in place T...
[16:13:21] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (BPirkle) Option 1 seems to have the fewest question marks and should be straightforward to implement: >We simply refuse to execute the q...
[16:13:24] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (Pchelolo) > If we can't, then let's simply return an error for now and fix it when we have an index in a few weeks. If we do absolutely...
[16:13:43] Theoretical. I'm considering how to make a graph database of all the internal links for all article revisions, for all languages of Wikipedia, and then do science to it. If I can get access to backups instead of XML dumps, then backups.
[16:20:10] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (eprodromou) >>! In T235572#5632026, @Pchelolo wrote: > If we do absolutely nothing, it will return 400 currently. Is it ok? No, I'd lik...
[16:20:11] sorry, backups are private only because they contain passwords and ips
[16:20:44] a full recovery of all xmldumps with history may take a long time
[16:21:15] but if you need only partial data it should be as fast as you can parse the few TB of wikitext they contain
[16:21:46] nos: if I may direct you to a better place for asking questions about xmldumps
[16:21:57] yes, please
[16:22:18] this mailinglist will have the xmldumps maintainer and many of its consumers: https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
[16:22:31] so you can ask practical tips there
[16:22:39] (tools, times, size, etc.)
[16:22:45] Brilliant. Thank you. =)
[16:23:17] there is also a research and analytics list that may be interesting for you
[16:23:37] hopefully they will be able to help you there better
[16:25:14] I'm sure I won't be the first one with this idea.
:)
[16:26:29] indeed, in fact post-parsed links are a separate download
[16:26:40] let me search it real quick
[16:28:10] this is a snapshot of the pagelinks table, which says which pages link where: https://dumps.wikimedia.org/enwiki/20191101/enwiki-20191101-pagelinks.sql.gz
[16:30:38] I believe that is how they implemented things like https://www.sixdegreesofwikipedia.com/
[16:30:54] but please ask on the appropriate list, as that would be mostly offtopic on this channel
[16:31:02] This is very convenient... And I find a lot of other useful information broken up on the root of that link.
[16:31:24] Again, thank you =)
[16:31:40] your welcome
[16:32:23] *you're
[18:30:37] DBA, Operations, Puppet, User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (Krenair) Acme-chief nginx config probably?
[18:53:17] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (Dzahn)
[21:21:33] DBA, Operations, serviceops, Goal, Patch-For-Review: Switchover backup director service from helium to backup1001 - https://phabricator.wikimedia.org/T236406 (jcrespo) Down to 2: ` All failures: 2 (cloudweb2001-dev, ...), Fresh: 90 jobs ` Which should be fixed when cloud patch is reviewed an...
[22:07:48] DBA, CPT Initiatives (Core REST API in PHP), Core Platform Team Workboards (Green): Compose query for minor edit count - https://phabricator.wikimedia.org/T235572 (BPirkle) https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/545590/ was merged, with the "check edit count before executing minor edit c...
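For the link-graph idea discussed at 16:13:43 to 16:30:38, the pagelinks snapshot can be read as a stream of edges. The following is a minimal sketch, not a tool used by anyone in the channel: it assumes the dump consists of mysqldump-style `INSERT INTO` lines and that each row is laid out as `(pl_from, pl_namespace, 'pl_title', pl_from_namespace)`, which should be checked against the `CREATE TABLE` statement at the top of the file. The function names `iter_links` and `read_dump` are hypothetical.

```python
import gzip
import re

# Assumed row layout: (pl_from, pl_namespace, 'pl_title', pl_from_namespace).
# The title group accepts backslash-escaped characters such as \' inside it.
ROW = re.compile(r"\((\d+),(-?\d+),'((?:[^'\\]|\\.)*)',(-?\d+)\)")


def iter_links(line):
    """Yield (source_page_id, target_namespace, target_title) per row."""
    for match in ROW.finditer(line):
        # Undo backslash escaping in the title (\' becomes ', \\ becomes \).
        title = re.sub(r"\\(.)", r"\1", match.group(3))
        yield int(match.group(1)), int(match.group(2)), title


def read_dump(path):
    """Stream link tuples from a locally downloaded pagelinks .sql.gz file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("INSERT INTO"):
                yield from iter_links(line)


if __name__ == "__main__":
    sample = ("INSERT INTO `pagelinks` VALUES "
              "(10,0,'Computer_accessibility',0),(12,0,'O\\'Reilly_Media',0);")
    for edge in iter_links(sample):
        print(edge)
```

Streaming line by line keeps memory use flat, which matters for a dump of this size; practical questions about tooling are still better asked on the xmldatadumps-l list mentioned above.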