[00:01:24] !log tools Rebuilt python and python2 Docker images (T157744) [00:01:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [00:01:31] T157744: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744 [00:02:46] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 13Patch-For-Review: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3027802 (10bd808) >>! In T157744#3027599, @JustBerry wrote: > @bd808 Thanks for merging. Do(es) particular container(... [00:04:04] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 07Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#3027825 (10bd808) [00:04:07] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 15User-bd808: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3027820 (10bd808) 05Open>03Resolved a:03bd808 Please open a new ticket to track additional issues beyond the libicu i... [00:36:50] 10Tool-Labs-tools-Other: Fix tool kmlexport - https://phabricator.wikimedia.org/T92963#3027991 (10bd808) >>! In T92963#3027208, @Thgoiter wrote: > He did. "Do what you want" = public domain. According to https://en.wikipedia.org/wiki/Public-domain_software (not an official legal opinion, but a reputable seconda... [01:30:33] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Review OpenStack monitoring options w/out Mirantis packages - https://phabricator.wikimedia.org/T157760#3015592 (10faidon) Thank you so much fixing this so quickly! (FWIW I don't know much about those checks, but running those with NRPE sounds indeed like a... [01:32:27] bd808: no problems in opening another ticket, but there's no way I can restart from my end, right? [01:32:48] JustBerry: restart what? [01:32:49] i.e. 
restarting the container (not the webservice) [01:32:52] bd808: ^^ [01:33:06] !phab T157744 > bd808 [01:33:06] T157744: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744 [01:33:09] the webservice command starts/stops containers [01:33:28] bd808: Why did you make a ticket then? (sorry, was confused) [01:33:30] the "webservice" *is* a container [01:33:49] bd808: I realize, that's why I wasn't sure why you said open another ticket [01:33:52] I thought maybe there was something else [01:34:04] I meant if you need more help debugging other things [01:34:09] bd808: Ah sure, thanks. [02:47:52] PROBLEM - Free space - all mounts on tools-exec-gift is CRITICAL: CRITICAL: tools.tools-exec-gift.diskspace._public_dumps.byte_percentfree (No valid datapoints found)tools.tools-exec-gift.diskspace.root.byte_percentfree (<22.22%) [03:02:35] bd808: same traceback for the icu lib issue [03:06:29] 10Tool-Labs-tools-Pageviews: Add Mediaviews to Pageviews suite - https://phabricator.wikimedia.org/T149642#3028268 (10MusikAnimal) @Harej @harej-NIOSH (sorry don't know which one to ping), I have finally started to work on this. I'm getting around the cross-origin policy by disabling web security in the browser,... [03:09:19] 10Tool-Labs-tools-Other: tool.spiarticleanalyzer: webservice stop yields ValueError: get() more than one object - https://phabricator.wikimedia.org/T158152#3028275 (10JustBerry) [03:09:39] 10Tool-Labs-tools-Pageviews: Add Mediaviews to Pageviews suite - https://phabricator.wikimedia.org/T149642#3028288 (10MusikAnimal) Also, to be clear, we're only supporting playable audio and video, and not images? [03:10:35] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#1729232 (10bd808) I have planned work for #striker that is related ({T149458}). 
From that task:... [03:13:34] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 13Patch-For-Review: k8s webservice restart failure with `ValueError: get() more than one object; use filter` - https://phabricator.wikimedia.org/T156626#3028294 (10JustBerry) 05Resolved>03Open `webservice stop` in `~` (`(venv)tools.spiarticleanalyzer@tools-basti... [03:16:44] 10Tool-Labs-tools-Other, 07Tracking: Issues related to tool.spiarticleanalyzer - https://phabricator.wikimedia.org/T157767#3028300 (10JustBerry) [03:16:46] 10Tool-Labs-tools-Other: tool.spiarticleanalyzer: webservice stop yields ValueError: get() more than one object - https://phabricator.wikimedia.org/T158152#3028296 (10JustBerry) 05Open>03stalled Similar to T156626. Reinitiated discussion there. Case status stalled per associated ticket. [03:34:13] 10Tool-Labs-tools-Other, 07Tracking: Issues related to tool.spiarticleanalyzer - https://phabricator.wikimedia.org/T157767#3028306 (10JustBerry) [03:34:15] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 07Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#3028307 (10JustBerry) [03:34:18] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 15User-bd808: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3028304 (10JustBerry) 05Resolved>03Open @bd808 @yuvipanda `webservice --backend=kubernetes python2 shell` `python` `i... [03:38:41] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 15User-bd808: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3028308 (10yuvipanda) python-pyicu won't be installed - you should install pyicu inside your virtualenv. We don't want to p... 
[03:44:35] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028309 (10Tgr) Ideally such a catalog would include all tools (users don't care that much about... [03:48:56] 10Tool-Labs-tools-Other, 07Tracking: Issues related to tool.spiarticleanalyzer - https://phabricator.wikimedia.org/T157767#3028313 (10bd808) [03:48:59] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 07Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#3028314 (10bd808) [03:49:02] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 15User-bd808: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3028311 (10bd808) 05Open>03Resolved == Kubernetes & Python 2 == ``` tools.bd808-test@tools-bastion-02:~$ webservice --b... [04:14:45] bd808: thanks [04:20:20] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 13Patch-For-Review: k8s webservice restart failure with `ValueError: get() more than one object; use filter` - https://phabricator.wikimedia.org/T156626#3028335 (10bd808) The patched version still has the problem of failing if `pod = self._find_obj(pykube.Pod, self.... [04:20:52] bd808: around? I'm running into a weird python error I've never seen before... I created a virtualenv on tools-dev, pip install youtube-dl, then tried executing that from a exec node and it's failing on some C symbol related thing: https://paste.fedoraproject.org/558713/87132378/raw/ [04:21:10] * bd808 looks [04:22:50] legoktm: maybe a python 3.4 bug? -- https://stackoverflow.com/questions/33223713/python-ctypes-import-error-in-virtualenv [04:23:21] hmm [04:23:25] is this something that could run on Kubernetes? 
[04:23:52] bd808: webservice stop: Your webservice is not running [04:24:00] webservice --backend=kubernetes python2 start: Your job is already running [04:24:00] can you run cron job scripts on k8s? [04:24:04] tried few times already [04:24:20] nvm [04:24:25] tried webservice --backend=kubernetes python2 stop [04:24:28] seemed to work [04:24:39] legoktm: not yet, unfortunately. [04:24:43] legoktm: hmmm... probably not [04:24:59] bd808: the version of k8s we run has built in cronjob support [04:25:11] https://kubernetes.io/docs/user-guide/cron-jobs/ [04:25:12] bd808: btw ditched a commit I made for python-pyicu [04:26:07] bd808: how do I point webservice to venv-python2-icu and not venv [04:26:34] or should i rename the dir venv? followed example from task bd808 [04:26:39] rename it [04:26:41] JustBerry: just install the package in your normal venv [04:26:59] I was not giving cut-n-paste instructions [04:27:00] bd808: it'll read TWO vens...? [04:27:08] I'll try building the virtualenv on an exec node? I assume all trusty exec nodes will have the exact same python3 version [04:27:21] legoktm: yeah they will be the same [04:28:02] JustBerry: I really can't help you any more on these things. I don't have the time to write your entire tool and let you copy it. [04:28:23] bd808: The deps I need aren't installed [04:28:36] I'm trying to create a workaround [04:28:46] JustBerry: yes. they are. That's what my pastes showed. [04:29:01] at lest libicu is [04:29:25] bd808: yes, but then that method leads to two vens and webservice not able to read correctly [04:29:37] no. it does not [04:30:26] I was just showing that it was available in both python2 and python3 containers [04:31:19] bd808: virtualenv doesn't matter? [04:31:47] correct [04:32:58] JustBerry: I think you may need to spend some time reading about basic use of virtual environments. you seem to be pretty confused about them. 
[04:33:36] try http://docs.python-guide.org/en/latest/dev/virtualenvs/ [04:33:46] bd808: not really. the issue here is that the VIRTUAL_ENV is set to www/python/venv [04:33:48] not that dir [04:34:01] that is the venv you should be using [04:34:28] JustBerry: webservice enforces that convention, so your easiest bet (unless you are comfortable writing your own uwsgi config files) is to follow it. [04:34:35] the paste I made was a proof of working install, not instructions for you to cut and paste [04:35:32] you need to use your www/python/venv and install pyicu [04:35:33] bd808: virtualenv venv-python2-icu seems to be the only different step between what I did and your instructions [04:35:54] and using venv-python2-icu as the dirname [04:35:58] you installed the package? [04:36:11] the directory name is irrelevant [04:36:20] I promise [04:38:10] bd808: I know it seems stupid. I just thought there may be an internal customization to initialize different versions of the venv based on dirname--but that's cleared up now [04:38:26] bd808: I've had pyicu installed in every single venv I've used [04:38:35] p.s. bastion pyicu is oudated [04:41:26] JustBerry: so you installed pyicu in your kubernetes venv after the required library was installed in the Docker containers this afternoon? [04:41:53] or you tried to install it before then and it failed to build because of the missing library? [04:42:37] bd808: I'm trying it again just to double check. [04:42:54] p.s. packages on Trusty are the versions that are shipped by Ubuntu. We don't control that and that is why we are recommending virtual envs for all new development. [04:43:55] bd808: yeah icu was getting trippy though (glad we presumably got that resolved) [04:52:48] 06Labs, 10Tool-Labs, 10Tools-Kubernetes: Allow running cronjobs on k8s - https://phabricator.wikimedia.org/T158155#3028350 (10yuvipanda) [04:55:40] bd808: it. works. [04:55:42] . 
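The question bd808 keeps returning to (is pyicu installed in the venv that webservice actually uses?) can be checked from inside the interpreter. What follows is a minimal sketch, not part of any Tool Labs tooling: the helper names are made up for illustration, and the only assumption about PyICU is that it is imported under the module name `icu`.

```python
import importlib
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # The classic `virtualenv` tool sets sys.real_prefix; the stdlib `venv`
    # module (and newer virtualenv releases) make sys.base_prefix differ
    # from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def module_available(name):
    """Return True if `name` can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
    print("icu importable:", module_available("icu"))
```

Run with the venv's own interpreter (for example from a `webservice --backend=kubernetes python2 shell` session), the printed prefix shows which environment is in use, regardless of what the directory is called.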
[04:55:57] almost like I'm not a liar [04:56:52] 10Tool-Labs-tools-Other, 10Tools-Kubernetes, 15User-bd808: tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes - https://phabricator.wikimedia.org/T157744#3028364 (10JustBerry) Commit referenced above abandoned. Task resolved. [04:58:03] bd808: I think I had to rebuild venv and such (and never said you were LYING) [04:58:07] but thank you ^^ [04:58:12] yuvipanda: thank you to you too [04:59:13] JustBerry: yw. [05:12:24] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028370 (10bd808) >>! In T115650#3028309, @Tgr wrote: > Ideally such a catalog would include all... [05:21:44] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028372 (10Tgr) >>! In T115650#3028370, @bd808 wrote: > This class of problem is the sort of thi... [05:42:27] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Dejavu was created, changed by Dejavu link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Dejavu edit summary: Created page with "{{Tools Access Request |Justification=I want to create statics tools |Completed=false |User Name=Dejavu }}" [05:52:59] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028377 (10bd808) "being developed now for Commons" with an expected first ship date about a yea... 
[06:06:28] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028382 (10Tgr) Sure but federation is not the main blocker for that (MCR is). [06:23:19] 06Labs, 10Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2946813 (10Joe) >>! In T143349#3025145, @MoritzMuehlenhoff wrote: > I think we should simply drop 5.3 from the CI tests, then. I wasn't aware that the PHP versions had to be co-installab... [07:00:10] 06Labs, 10DBA, 13Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3028418 (10Marostegui) >>! In T153743#3026114, @jcrespo wrote: > I've added a workaround that makes no sense but that works for now ,we need to revisit it... [07:19:12] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, 07User-notice: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3002516 (10Marostegui) ``` 04:23 < yuvipanda> marostegui: jynus I can verify that I can access labsdb1004 from tools, so no need to massage VLANs or fi... [07:38:04] 10Tool-Labs-tools-Other: Fix tool kmlexport - https://phabricator.wikimedia.org/T92963#3028428 (10Thgoiter) I don't want to argue about licenses, just told my opinion. My intention and wish is to ensure that the functionality of kmlexport won't be lost someday. A tool that is linked from hundreds of thousands o... [09:38:24] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, 07User-notice: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3028611 (10Marostegui) After a chat with Jaime we have moved those old databases in labsdb1005 to: `labsdb1005:/srv/tmp/old_dbs` . They didn't have an... 
[09:43:35] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3028620 (10Tgr) This was the fourth most popular item in the [[https://www.mediawiki.org/wiki/Us... [10:00:43] 06Labs, 10Analytics, 10DBA: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#3028656 (10JAllemandou) [10:39:43] 10Tool-Labs-tools-Xtools, 06Community-Tech: [PLAN] Move development for xtools from my repo to the project repo - https://phabricator.wikimedia.org/T158102#3028748 (10Niharika) [10:47:00] Anyone know why I might be getting back 'error 502 bad gateway' from http://tools-elastic-01.tools.eqiad.wmflabs. Not all the time but often enough. Is there someone who could look in the logs or somewhere I can look myself? [10:52:25] PROBLEM - Puppet run on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [10:57:03] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 13Patch-For-Review: k8s webservice restart failure with `ValueError: get() more than one object; use filter` - https://phabricator.wikimedia.org/T156626#3028786 (10scfc) How can there be more than one `webservice` pod? That should be impossible to achieve, and IMNS... 
[11:32:26] RECOVERY - Puppet run on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [0.0] [11:32:53] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Dejavu was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=1518989 edit summary: [11:37:53] RECOVERY - Free space - all mounts on tools-exec-gift is OK: OK: tools.tools-exec-gift.diskspace._public_dumps.byte_percentfree (No valid datapoints found) [11:40:23] 06Labs, 10Tool-Labs: Puppet fails on tools-docker-builder-03 - https://phabricator.wikimedia.org/T157415#3028897 (10scfc) This is still happening. As https://wikitech.wikimedia.org/wiki/Tools_Kubernetes#Image_building now refers to `tools-docker-builder-04`, can `tools-docker-builder-03` be deleted? [13:01:19] 06Labs, 10Tool-Labs, 10Tools-Kubernetes, 13Patch-For-Review: k8s webservice restart failure with `ValueError: get() more than one object; use filter` - https://phabricator.wikimedia.org/T156626#3029015 (10JustBerry) @scfc No worries, but http://tools.wmflabs.org/spiarticleanalyzer/ seems to work fine for m... [13:01:26] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3029016 (10Dereckson) I'm really concerned Hay doesn't comment this task. The tool produced is... [13:21:10] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Udo T. was created, changed by Udo T. link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Udo_T. edit summary: Created page with "{{Tools Access Request |Justification=I want to run my global interwiki bot [[:m:Special:CentralAuth/UT-interwiki-Bot|UT-interwiki-Bot]] on Tool Labs |Completed=false |User Na..." [13:34:58] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Udo T. was modified, changed by Udo T. 
link https://wikitech.wikimedia.org/w/index.php?diff=1520167 edit summary: corr. [13:55:34] Quick question relating to what would be appropriate for a lab project instance - we're looking at debugging why ExternalLinksChange (EventLogging) doesn't always log each change of external link and we need somewhere we can make some quick changes to a debug version of the spam blacklist extension (where the hook for ExternalLinksChange is). We initially [13:55:34] looked at the beta cluster but apparently it's more suited for staging and not as a "development playground" [13:56:14] would a project instance be suitable here? I imagine we could use vagrant to get a MW install up quickly and enable the eventlogging role [14:21:04] samtar: [14:21:12] samtar: yeah beta is more a staging/integration area [14:21:39] we usually don't live hack there nor do we cherry pick any patches. The intent is to run what ever is merged in the master branches [14:22:00] that being said [14:22:18] the MediaWiki configuration there can be adjusted via operations/mediawiki-config.git , there are a few files ending with -labs.php [14:22:46] that let you adjust settings for the beta cluster. For example adding some debug log bucket, registering a hook for debugging etc [14:24:17] hashar: Okay? 
I'm afraid my knowledge of the beta cluster is as limited to "not to be used as a development playground" :P [14:24:30] looks like Spamblack list extensions has a bunch of wfDebugLog( 'SpamBlacklist' [14:25:05] so in theory one could add a bunch of wfDebugLog to the extension, and enable that log bucket on beta [14:25:16] though beta cluster already logs all wfDebug things in a huge file [14:26:21] well we're interested in using the beta cluster for https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/TestingOnBetaCluster#How_to_verify_events - but we'd also like somewhere similar where we can do some live changes [14:26:47] hashar: and I believe the logging you just mentioned is part of that "how to verify events" guide [14:27:38] samtar: you are new to the whole mess of mediawiki/wikimedia development are you ? [14:28:18] hashar: this side of things, very :) [14:28:35] do you have any past experience with mediawiki? [14:29:04] More on the configuration side of things, not extensions [14:29:36] so the whole grand scheme of the thing [14:29:47] whatever is in master branches must be production grade [14:30:02] on tuesday we blindly cut a branch ( wmf/xxx ) based on whatever is in the master branch [14:30:24] so all experimental code / debugging code is usually hidden behind a feature flag such as: [14:30:38] $wgSpamblacklistDebuggingFeature = false; [14:30:46] which is false by default ^^^ [14:31:01] then we can enable it on beta cluster ( via the -labs.php files in operations/Mediawiki-config.git ) [14:31:26] Ah right! 
[14:31:33] and later enable the feature in production solely for the test wiki ( via something like: wgSpamblacklistDebuggingFeature = [ 'testwiki' => true, 'default' => false ] ) [14:31:46] and later do a rolling deploy of the feature wiki per wiki [14:31:59] eg enable it solely for nlwiki by just adding: 'nlwiki' => true [14:32:30] there are other nasty ways such as doing A/B testing and having the feature only enabled for X percent of users or based on some criteria. But I am not familiar with that [14:32:46] for the log side of things [14:33:42] MediaWiki legacy logging uses wfDebugLog( bucket_name, message ) [14:33:58] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Udo T. was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=1520789 edit summary: [14:34:20] so for SpamBlacklist you can see in the code lines such as: wfDebugLog( 'SpamBlacklist', "Added URLs: " . implode( ', ', $addedLinks ) ); [14:34:33] iirc by default a log bucket (eg: SpamBlacklist) is NOT logged [14:35:29] So to get to a point of having some debugging code I should probably just work on a local instance? [14:35:46] we can enable it on beta [14:35:52] but yeah I guess most people just use a local instance [14:36:00] there is the mediawiki/vagrant project [14:36:09] that lets you spawn a mediawiki locally inside a vagrant instance [14:36:19] and it has a bunch of helpers to enable extensions and their backends [14:36:20] yeah, and I believe there is an eventlogging role? 
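The per-wiki rollout hashar describes (off by default, on for the test wiki, then enabled wiki by wiki) comes down to a lookup with a `default` fallback. Here is a minimal Python sketch of that idea only. It is not how MediaWiki's configuration loader is implemented, and `resolve_setting` is a name invented for illustration.

```python
def resolve_setting(per_wiki, wiki):
    """Pick the value for `wiki`, falling back to the 'default' entry."""
    if wiki in per_wiki:
        return per_wiki[wiki]
    return per_wiki["default"]


# Hypothetical flag from the discussion: off everywhere except the test wiki.
debugging_feature = {"testwiki": True, "default": False}

# Rolling the feature out to one more wiki is a one-key change.
after_rollout = dict(debugging_feature, nlwiki=True)

if __name__ == "__main__":
    for wiki in ("testwiki", "nlwiki", "enwiki"):
        print(wiki,
              resolve_setting(debugging_feature, wiki),
              resolve_setting(after_rollout, wiki))
```

Keeping the default at `False` is what makes it safe to cut the weekly branch from master: code behind the flag ships everywhere but runs only where it is explicitly switched on.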
[14:36:49] most probably [14:37:39] thanks for your help hashar :) [14:38:56] samtar: and if at some point you want to enable the SpamBlacklist log bucket on beta cluster [14:39:05] you want to clone from gerrit operations/mediawiki-config.git [14:39:08] https://github.com/wikimedia/operations-mediawiki-config/blob/0115097/wmf-config/InitialiseSettings-labs.php#L146-L156 [14:39:29] the wmgMonologChannels enable log buckets and the level [14:39:32] so most probably: [14:39:36] '+beta' => [ [14:39:46] 'SpamBlacklist' => 'debug', [14:39:46] .... [14:39:48] ] [14:40:59] then via some complicated chain the messages eventually end up in an ElasticSearch cluster [14:41:17] and you can search the logs via: https://logstash-beta.wmflabs.org/app/kibana , you can try there searching for : channel:CentralAuthVerbose [14:41:29] which is the log bucket enabled above via wmgMonologChannels [14:41:55] production has the same http://logstash.wikimedia.org though access is restricted due to private information (such as potentially the IPs of end users) [14:42:29] samtar: do poke me as needed. I am usually in #wikimedia-releng and #wikimedia-operations and my timezone is Europe (UTC+1) [14:43:27] hashar: that's really helpful thank you :) I'm UTC so that's good too [14:44:06] samtar: I am not sure in which IRC channels the other European devs hide :D [14:44:10] #wikimedia-mobile has some [14:44:27] and #mediawiki-core has a few others during our afternoon [14:50:38] Labs, labs-sprint-118, labs-sprint-119, Documentation: Document support levels for tools and labs projects - https://phabricator.wikimedia.org/T116598#3029290 (chasemp) Open>stalled a:chasemp>None @bd808 I am stalling this for now. Keeping it on the radar is a definite need but .... 
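The `'+beta'` entry hashar suggests for wmgMonologChannels can be read as "layer these channels on top of the defaults for the beta cluster". The sketch below illustrates that reading in Python, assuming the `+` prefix means merge rather than replace. The default channel shown is made up, and `effective_channels` is a hypothetical helper, not the real configuration code.

```python
def effective_channels(settings, realm):
    """Return the channel -> level mapping that applies to `realm`."""
    if realm in settings:
        # A plain key replaces the defaults outright.
        return dict(settings[realm])
    # A '+'-prefixed key is merged on top of the defaults.
    channels = dict(settings.get("default", {}))
    channels.update(settings.get("+" + realm, {}))
    return channels


monolog_channels = {
    "default": {"CentralAuth": "info"},   # made-up default, for illustration
    "+beta": {"SpamBlacklist": "debug"},  # the override from the discussion
}

if __name__ == "__main__":
    print(effective_channels(monolog_channels, "beta"))
    print(effective_channels(monolog_channels, "production"))
```

Under that assumption, beta would gain the `SpamBlacklist` bucket at `debug` level while every other realm keeps only the defaults.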
[15:05:49] 06Labs, 06Operations: Reimage labstore1001 and labstore1002 for DRBD storage setup - https://phabricator.wikimedia.org/T158196#3029409 (10chasemp) [15:06:19] 06Labs, 10Labs-Sprint-107: Ensure that labstore machine is 'known good' hardware - https://phabricator.wikimedia.org/T106479#3029428 (10chasemp) [15:06:21] How would I view the logs for tools-elastic-01.tools.eqiad.wmflabs? I'm getting back a 502 occasionally and I would like to find out why. [15:06:22] 06Labs, 10Labs-Infrastructure, 06DC-Ops, 06Operations, and 2 others: labstore1002 issues while trying to reboot - https://phabricator.wikimedia.org/T98183#3029424 (10chasemp) 05Open>03Resolved closed in favor of T158196 [15:07:54] 06Labs, 06Operations, 13Patch-For-Review, 07Tracking: overhaul labstore setup [tracking] - https://phabricator.wikimedia.org/T126083#3029452 (10chasemp) [15:08:04] 06Labs, 06Operations, 13Patch-For-Review, 07Tracking: overhaul labstore setup [tracking] - https://phabricator.wikimedia.org/T126083#2004225 (10chasemp) [15:08:12] 06Labs, 06Operations, 13Patch-For-Review, 07Tracking: overhaul labstore setup [tracking] - https://phabricator.wikimedia.org/T126083#2004225 (10chasemp) [15:08:15] 06Labs, 06Operations, 07Wikimedia-Incident: Investigate better way of deferring activation of Labs LVM volumes (and corresponding snapshots) until after system boot - https://phabricator.wikimedia.org/T121629#3029461 (10chasemp) 05Open>03declined closing in favor of T158196 [15:08:21] 06Labs, 06Operations, 13Patch-For-Review, 07Tracking: overhaul labstore setup [tracking] - https://phabricator.wikimedia.org/T126083#2004226 (10chasemp) [15:08:23] 06Labs, 10Labs-Infrastructure: Duplicate /proc/mounts entries on labstore1001 - https://phabricator.wikimedia.org/T153728#3029467 (10chasemp) 05Open>03declined closing in favor of T158196 [15:09:49] tarrow: good question, I'm not sure how we could make that self service atm. One of the labs admins could look for you faster I think. 
Can you make a task with what you are seeing and lookign for and we can grab relevent logs for you to review? [15:10:25] chasemp: sure! Thanks. I'll do that [15:10:54] tarrow: try to cc me on the task if you would :) [15:25:38] 06Labs, 10Tool-Labs, 07Epic: Find a solution for tools-exec-gift on Trusty - https://phabricator.wikimedia.org/T156981#3029526 (10chasemp) @Giftpflanze if we stand up a Trusty host with the same characteristics as the existing precise node are you up for migrating things? That is our current thinking. Tran... [15:26:35] 06Labs: Provide snapshot of http://tools-elastic-01.tools.eqiad.wmflabs logs - https://phabricator.wikimedia.org/T158199#3029527 (10Tarrow) [15:31:03] 06Labs: Provide snapshot of http://tools-elastic-01.tools.eqiad.wmflabs logs - https://phabricator.wikimedia.org/T158199#3029548 (10chasemp) Can you provide some details on the tool(s) in question and what we are looking for? heads up @bd808 [15:45:08] 06Labs, 06Operations, 10hardware-requests: Codfw: (1) hardware access request for labtest - https://phabricator.wikimedia.org/T154706#3029607 (10chasemp) [15:48:19] 06Labs, 06Operations, 10hardware-requests: Codfw: (1) hardware access request for labtest - https://phabricator.wikimedia.org/T154706#3029622 (10chasemp) 05stalled>03Open [15:52:24] 06Labs: Provide snapshot of http://tools-elastic-01.tools.eqiad.wmflabs logs - https://phabricator.wikimedia.org/T158199#3029647 (10Tarrow) So the tool is wikifactmine-pipeline which auths with elasticsearch as 'tools.wikifactmine-pipeline'. What it is doing is making queries to one index; parsing the response... [15:54:20] 06Labs, 06Operations, 10hardware-requests: Eqiad: (2) hardware access request for labnet1003/1004 - https://phabricator.wikimedia.org/T158204#3029672 (10chasemp) [16:09:09] chasemp: your comment on T156981 is confusing. I guess you mean: You will provide a trusty exec node with an appropriate queue? And I will just use that instead of the current one? 
[16:09:09] T156981: Find a solution for tools-exec-gift on Trusty - https://phabricator.wikimedia.org/T156981 [16:10:13] annika: basically, a second trusty node in teh same queue for you to migrate to [16:10:41] that would be fine [16:11:14] ok, we'll see if we can get that going annika thanks [16:17:03] 06Labs, 06Operations, 10hardware-requests: Eqiad: (2) hardware access request for labcontrol1003/1004 - https://phabricator.wikimedia.org/T158207#3029770 (10chasemp) [16:18:03] 06Labs, 10Tool-Labs, 07Epic: Find a solution for tools-exec-gift on Trusty - https://phabricator.wikimedia.org/T156981#3029773 (10Giftpflanze) > < annika> chasemp: […] I guess you mean: You will provide a trusty exec node with an appropriate queue? And I will just use that instead of the current one? > < cha... [16:21:17] samtar: https://wikitech.wikimedia.org/wiki/Help:MediaWiki-Vagrant_in_Labs [16:21:38] zhuyifei1999_: yeah saw that, but I'd need a labs project instance correct? [16:21:45] very useful alternative for debugging if you have trouble setting up vagrant locally [16:21:49] yep [16:22:21] Would debugging be a reasonable reason to get one set up for me? [16:23:23] samtar: sure [16:23:24] If it's urgent, I can lend a project to you :P [16:23:37] esp if you ask for a temporary setup even if temp is open ended-ish [16:23:59] samtar: requesting a project to test/debug an extension or service is completely legitimate :) [16:24:04] chasemp: that would be really useful... I imagine it would be pretty temporary really [16:24:28] bd808: :D the more I see here the more awesome all you lot are ^^ [16:24:43] samtar: no worries https://phabricator.wikimedia.org/T76375 [16:24:52] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, 07User-notice: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3029816 (10Marostegui) For the backup data: es1017 looks like a good candidate: ``` marostegui@es1017:~$ df -hT /srv Filesystem Type Size... 
[16:26:02] empowering ppl to debug on their own is our favorite thing [16:26:05] heh [16:26:30] samtar: awww thanks. We try to be awesome [16:28:56] guess you guys have found the new cloud motto: "empowering ppl to debug on their own is our favorite thing" [16:29:02] I sign off for that [16:29:38] It's a good motto indeed [16:31:37] 06Labs: Request creation of labs project - https://phabricator.wikimedia.org/T158210#3029825 (10Samtar) [16:31:54] 06Labs: Request creation of extlinkschange-testing labs project - https://phabricator.wikimedia.org/T158210#3029841 (10Samtar) [16:32:46] not seeing the section of the task title...how embarrassing :P [16:33:37] samtar: we usually review and create on mondays but we can probably knock it out sooner just fyi [16:34:34] chasemp: I'd be appreciative if you could, if only so I have this weekend to work on it :) [16:35:39] 06Labs: Request creation of extlinkschange-testing labs project - https://phabricator.wikimedia.org/T158210#3029862 (10chasemp) +1 -- poking @andrew for a second :) (user asked if we could get this done before the weekend) [16:43:38] Change on 12www.mediawiki.org a page Wikimedia Labs/Tool Labs was modified, changed by 175.141.121.64 link https://www.mediawiki.org/w/index.php?diff=2398252 edit summary: [16:44:56] Change on 12www.mediawiki.org a page Wikimedia Labs/Tool Labs was modified, changed by Samtar link https://www.mediawiki.org/w/index.php?diff=2398254 edit summary: Undo revision 2398252 by [[Special:Contributions/175.141.121.64|175.141.121.64]] ([[User talk:175.141.121.64|talk]]) [17:11:15] 06Labs, 10Tool-Labs: Puppet fails on tools-docker-builder-03 - https://phabricator.wikimedia.org/T157415#3004414 (10yuvipanda) Yup! -03 just needs to be deleted... 
[17:35:32] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3030000 (10Husky) Hey, thanks @Dereckson for pointing me towards this discussion, didn't even kn... [17:51:51] tools db has been failed over for announced maintenance just now [17:53:10] {the world burns} :D [17:55:53] technically, switched over :-) [18:26:58] Hmm. Is it intended as part of this maintenance that labsdb is entirely down? [18:31:19] Ah... It looks like "tools.labsdb" is pointing at a wrong server, while "tools-db" works. [18:31:41] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3030244 (10bd808) >>! In T115650#3030000, @Husky wrote: > It's just 400 tools, and history has p... [18:31:59] ^yuvipanda [18:34:29] anomie where are you seeing this? They both should be pointing to the same thing [18:34:48] DNS cache effects maybe [18:36:24] yuvipanda: I restarted a bunch of my bots' jobs because they wound up dying with "The MariaDB server is running with the --read-only option so it cannot execute this statement". The jobs exited right away with complaints about "Can't connect to MySQL server on 'tools.labsdb' (111)". I changed the config to use tools-db, and then it worked. [18:37:05] maybe you just hit the time during the failover [18:37:23] on failover, there is a small period of time in read only [18:37:38] Yeah I suspect that is just DNS cache there. tools-db is less widely used so not in cache I suspect [18:37:47] and maybe your application cached the ip [18:37:48] Let me try invalidating all.caches on grid [18:37:59] or the dns [18:38:07] The dying was at around 17:42, the restarting was a few minutes ago, FYI. 
[18:38:37] 61 users are currently connected, though [18:40:41] I had tried my config (~/.my.cnf) from the command line with the "mysql" command too, and got the same error. Now if I try "mysql -h tools.labsdb" it connects fine, though, so DNS caching seems likely enough. [18:45:05] !log tools clush a restart of nscd across all of tools [18:45:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [18:54:05] anomie: that should help (it finished a while ago). I'd prefer people continued using the tools.labsdb one :D [19:00:06] does somebody know, how to setup tunnels at royal ts? => so that the programm connects to my instance via bastion as tunnel [19:01:30] 06Labs, 06Operations: Reimage labstore1001 and labstore1002 for DRBD storage setup - https://phabricator.wikimedia.org/T158196#3029409 (10greg) (those tasks above that this task was mentioned in were all(?) in `#wikimedia-incident` as a follow-up/action item, should this one be as well?) [19:04:30] 06Labs, 06Operations: Reimage labstore1001 and labstore1002 for DRBD storage setup - https://phabricator.wikimedia.org/T158196#3030391 (10chasemp) >>! In T158196#3030375, @greg wrote: > (those tasks above that this task was mentioned in were all(?) in `#wikimedia-incident` as a follow-up/action item, should th... [19:04:45] nvw, found it [19:04:48] *nvm [19:12:15] Hi - we have https://twl-test.wmflabs.org/ running on labs and need to find somewhere for backups. Is there a common and/or sensible place to host backups? [19:13:02] the code, on git [19:13:27] This would be regarding the database [19:14:07] data, I am not sure [19:20:41] 06Labs, 15User-Hydriz: Dumps instances occasionally hammer NFS for temporary storage - https://phabricator.wikimedia.org/T134148#3030471 (10Nemo_bis) 05Open>03Resolved As far as I can see, this is resolved. I/O from NFS is quite slow from the dumps machines (less than 90 Mbps) and this is the worst effect... 
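For the tools.labsdb versus tools-db mismatch discussed above, a quick way to check whether two service names currently resolve to the same server from a given host is to compare the addresses they return. The following is only a minimal sketch: the helper names `resolve` and `same_target` are made up for illustration, and the result reflects whatever the local resolver (including any nscd cache) answers at that moment.

```python
import socket


def resolve(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError, so failed lookups land here.
        return None


def same_target(name_a, name_b, resolver=socket.gethostbyname):
    """True only when both names resolve and resolve to the same address."""
    addr_a = resolve(name_a, resolver)
    addr_b = resolve(name_b, resolver)
    return addr_a is not None and addr_a == addr_b


if __name__ == "__main__":
    # Hostnames taken from the conversation; they only resolve inside Tool Labs.
    for name in ("tools.labsdb", "tools-db"):
        print(name, "->", resolve(name))
    print("same target:", same_target("tools.labsdb", "tools-db"))
```

If the two names disagree right after a failover, a stale cache on the host is one plausible cause, which is what the nscd restart above was meant to clear.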
[19:40:43] 06Labs, 10Tool-Labs, 05Security: tool labs should filter out the Service-Worker-Allowed: header to prevent tools from setting it. - https://phabricator.wikimedia.org/T158216#3030584 (10Bawolff) 05Open>03Resolved a:03yuvipanda [19:40:47] 06Labs, 10Tool-Labs, 05Security: tool labs should filter out the Service-Worker-Allowed: header to prevent tools from setting it. - https://phabricator.wikimedia.org/T158216#3030063 (10Bawolff) [19:53:53] (03PS3) 10Jean-Frédéric: Refactor database_statistics.getStatistics [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/331407 [19:54:22] (03CR) 10Jean-Frédéric: "Answered :)" (033 comments) [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/331407 (owner: 10Jean-Frédéric) [20:17:22] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, and 2 others: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3002516 (10ops-monitoring-bot) Script wmf_auto_reimage was launched by jynus on neodymium.eqiad.wmnet for hosts: ``` ['labsdb1005.eqiad.wmnet'] ``` The lo... [20:37:44] 06Labs, 10Tool-Labs: templatetiger is using 613G in Tools out of 8T - https://phabricator.wikimedia.org/T136192#3030801 (10Kolossos) I update the db to 2017. Everything works fine. Now the database seems to be deleted/disappear. Anyone any idea? [20:42:08] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, and 2 others: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3030805 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['labsdb1005.eqiad.wmnet'] ``` and were **ALL** successful. [20:42:19] 06Labs, 10Tool-Labs: templatetiger is using 613G in Tools out of 8T - https://phabricator.wikimedia.org/T136192#3030806 (10yuvipanda) @Kolossos it's part of the db maintenance happening just now (announced at https://lists.wikimedia.org/pipermail/labs-announce/2017-February/000204.html). Your db was too big to... 
[20:45:09] 06Labs, 10Tool-Labs: templatetiger is using 613G in Tools out of 8T - https://phabricator.wikimedia.org/T136192#3030823 (10Kolossos) @yuvipanda Thank you. Fine. [21:19:30] (03PS4) 10Jean-Frédéric: Refactor database_statistics.getStatistics [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/331407 [21:24:02] 06Labs, 10Tool-Labs: Improve `webservice status` output - https://phabricator.wikimedia.org/T158244#3030943 (10Legoktm) [21:39:54] hi! have we changed anything ssl-related on tools.wmflabs.org? my mono webapp suddenly no longer works (TlsException) [21:41:30] leloiandudu is your tool using http [21:42:12] let me rephrase [21:42:27] is the code for the web pages for your tool using http [21:47:04] Zppix: I don't know what you mean. it doesn't make web requests, just responds to them [21:48:08] leloiandudu disregard I feel so dumb for asking that... I dont know where i was going with that [21:48:12] last time it was due to the fact that mono certificate store was old and didn't have one of the CA certs that tools.wmflabs.org used. but this time re-syncing the certificate store doesnt' help [21:49:01] I dont know if tools.wmflabs.org can use ssl certs on just certain tools, Krenair are you around to help? [21:53:21] 06Labs, 07Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#3031051 (10Andrew) [21:53:23] 06Labs: Request creation of extlinkschange-testing labs project - https://phabricator.wikimedia.org/T158210#3031048 (10Andrew) 05Open>03Resolved a:03Andrew I created the project. @Samtar, you are currently the only member, but you can add other members (or other projectadmins who can themselves add other... [21:56:10] it worked fine a day or to ago and I didn't make any changes, that's why I'm asking. 
maybe the certificate has been replaced recently [21:56:34] 06Labs, 10Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#3031062 (10Andrew) Option #1 is fine with me. Otherwise, @hashar, if you want to go with _joe_'s suggestion I'm available to help wrangle puppet for special-purpose testing nodes. [21:56:51] Zppix, no [21:58:55] ok [21:59:22] leloiandudu are you using the jobs grid or k8s [21:59:28] for webservice [22:02:14] Zppix, I'm not using anything apart from "webservice start" [22:02:57] the tool name is fountain btw [22:07:43] leloiandudu try running webservice stop and then try webservice start --backend=kubernetes [22:08:00] (03PS1) 10Jean-Frédéric: Add unittest to populate_image_table.processSource [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338007 [22:08:02] (03PS1) 10Jean-Frédéric: Extract method normalize_identifier and add unit tests [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338008 [22:08:04] (03PS1) 10Jean-Frédéric: Track number of tracked images (on top of found images) [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338009 [22:18:15] (03CR) 10jerkins-bot: [V: 04-1] Extract method normalize_identifier and add unit tests [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338008 (owner: 10Jean-Frédéric) [22:18:39] (03CR) 10jerkins-bot: [V: 04-1] Add unittest to populate_image_table.processSource [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338007 (owner: 10Jean-Frédéric) [22:19:01] (03CR) 10jerkins-bot: [V: 04-1] Track number of tracked images (on top of found images) [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338009 (owner: 10Jean-Frédéric) [22:24:56] Zppix, now it's 502: https://tools.wmflabs.org/fountain/ [22:25:12] do I need to change anything in lighttpd for kubernetes?
[22:29:52] (03PS2) 10Jean-Frédéric: Add unittest to populate_image_table.processSource [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338007 [22:29:55] (03PS2) 10Jean-Frédéric: Extract method normalize_identifier and add unit tests [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338008 [22:29:57] (03PS2) 10Jean-Frédéric: Track number of tracked images (on top of found images) [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338009 [22:30:20] (03CR) 10Jean-Frédéric: "Looks like patching pywikibot.output is tricky... oh well :)" [labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/338007 (owner: 10Jean-Frédéric) [22:36:07] i'm using perlbrew, why does it wish to use system perl? http://dpaste.com/2Z5XXPD.txt [22:36:19] this 'worked' ok on precise [22:44:03] gry: /mnt/nfs/labstore-secondary-tools-project/gpy/gpy/bot.sh: 7: /mnt/nfs/labstore-secondary-tools-project/gpy/gpy/bot.sh: source: not found [22:44:25] without that source, your $PATH is probably not set up correctly [22:45:51] i just added that line a min ago; in precise it worked without it [22:47:19] my $PATH is /data/project/gpy/perl5/bin:/data/project/gpy/perl5/perlbrew/bin:/tmp/1248557.1.task:/usr/local/bin:/bin:/usr/bin and $PERL5LIB is /data/project/gpy/perl5/lib/perl5 [22:49:28] 06Labs: Check exim4 versions on labs - https://phabricator.wikimedia.org/T158134#3031393 (10Andrew) 05Open>03Resolved I don't see evidence that this is happening widely, or at all. [22:55:03] valhallasw`cloud: Failed to create a user log table to use. This table is vital for the operation of this interface. Exiting... is coming back from my tool. This is new. Is the DB down? [22:56:26] Cyberpower678, check topic [22:56:57] jynus: I can't see the topic in my client. 
[22:57:15] thre are a few seconds of read-only when things fail over [22:57:25] I see [22:57:33] at the start [22:57:38] and at the end [22:57:44] we are at the end of the maintenace right now [22:57:52] what irc client are you using, Cyberpower678 ? [22:58:06] gry: limechat macOS [22:58:19] and does saying '/topic' in here make the topic show up ? [22:58:22] like this [22:58:24] /topic [22:58:26] a broken one if it doesn't show topics :) [22:58:49] gry: I did not know that command. [22:58:59] I used to have a client that shows it on the top. [22:59:13] bd808: Limechat is quite nice actually. [22:59:32] It's only failing is the missing topics bar on the top. [22:59:55] *nod* I use textual which forked from limechat back in the day [23:00:02] Derp [23:00:09] I didn't mean to set the topic. [23:00:14] * Cyberpower678 didn't know he can. [23:00:21] apparently our topic lock is off again :) [23:00:41] bd808: yes it is. [23:00:48] :p [23:00:55] I like it better off personally [23:01:05] more wiki-like [23:01:09] I don't mind. I'm a trusted admin on enwiki. :p [23:01:29] I don't understand why TLS is involved at all. Does the toolserver frontend nginx communicate with backends via https? That seems like a strange idea to me [23:01:47] Well it would seem that this tools-db maintenance has busted my maintenance script on the DB. :/ [23:01:58] Cyberpower678, it can't [23:02:06] we are in read only precisely [23:02:13] to avoid writes during the transition [23:02:24] either the writes are on both servers or on none [23:02:34] try restarting your connection [23:02:50] all databases are now back in read-write [23:04:59] 06Labs, 10Tool-Labs: templatetiger is using 613G in Tools out of 8T - https://phabricator.wikimedia.org/T136192#3031420 (10yuvipanda) @Kolossos it should be back now? [23:05:38] jynus: https://tools.wmflabs.org/iabot/ shows that the script died during the failover. :/ [23:05:52] That's a real time progress indicator. 
[23:07:02] * Cyberpower678 restarts the script [23:07:04] why did you do maintenance? [23:07:13] during a database maintenance [23:07:21] I didn't. [23:07:25] announced more than 1 week in advance? [23:07:26] leloiandudu: no tls between the front-end proxy and your tools. All of the wikimedia wikis and their apis use TLS though. [23:07:34] This was running for the last week. [23:07:35] I can tell you, no data was lost, at most [23:07:37] Or more [23:07:51] we were in read only for some time [23:08:01] Not until today [23:08:09] yes, the day of the maintenance [23:08:23] things were in read only for a small period of time [23:08:34] read only means if you write, things will fail [23:08:44] Cyberpower678: hi. your code should generally be able to deal with db disconnects. Let me know if you need pointers on how to do that. [23:08:50] Well it broke the script. If it can't execute a write query it terminates. [23:09:05] Cyberpower678: right. your script shouldn't do that. [23:09:33] yuvipanda: DB disconnects it handles. But if it's not from a DB disconnect aka writing to a DB that won't write, it terminates. [23:09:46] I designed it that way in case of a malfunction. [23:10:27] Cyberpower678: that seems sensible! so what would you like us to do now? we announced the maint. a while ago, and we minimized downtime as much as possible. [23:11:02] yuvipanda: nothing. I wasn't aware of the maintenance window being announced. Must've slipped by my email. [23:11:07] I was just confused. [23:11:18] Cyberpower678: ah, ok! [23:11:23] As to why my entire tool failed a moment ago with an error message. [23:11:50] My tool has failsafes and one is to not execute if it can't assure proper access to the DB. [23:13:21] Yay. 
The log is full of Failed to INSERT IGNORE INTO externallinks_enwiki (`pageid`,`notified`,`url_id`) VALUES (40075045,0,13497424); [23:13:35] :p [23:14:00] nice [23:24:40] 06Labs, 10Labs-Infrastructure, 10DBA, 06Operations, and 2 others: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3031456 (10jcrespo) This is mostly done, no major incidents- servers where only in read-only for a few seconds before and after the maintenance, for switc... [23:26:15] yuvipanda: I've restarted the script. Try not to kill the DB in the next 89 hours. :p [23:26:17] bd808, but my mono-based web tool is not able to start anymore today. I didn't make any changes to it and it worked 2 days ago. I get a TlsException during the start even if I don't access it via https. thoughts? [23:26:30] The time estimates on that tool will be off for a bit. [23:27:13] leloiandudu: does the error log tell you what mono is trying to do when it has the error? [23:28:01] PHP Warning: mysqli_connect(): (HY000/2002): Connection timed out in /mnt/nfs/labstore-secondary-tools-project/iabot/public_html/dbMaintainance.php on line 23 [23:28:04] jynus: [23:28:06] ^ [23:28:10] bd808, no. last time it was due to toolserver using a new CA certificate that mono wasn't aware of (it has a separate cert store). but this time updating the store doesn't help [23:29:28] Cyberpower678: :) I recommend fixing your script to recover gracefully :) [23:29:31] bd808 full error text: http://pastebin.com/kKZL2nQN [23:30:10] yuvipanda: that was on the script startup. It can handle a disconnect, but the script assumes it should be able to connect without issue on startup. :p [23:30:27] k [23:31:16] yuvipanda: and it keeps happening. [23:31:34] leloiandudu: and that happens without the mono application trying to make a request to any outside service? [23:33:03] Cyberpower678, you may be using tls [23:33:28] jynus: I haven't changed anything since the maintenance. 
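The advice above about recovering gracefully from a failover can be sketched as a small retry wrapper: instead of terminating on the first failed write, the script waits and tries again a bounded number of times. This is only an illustration under stated assumptions: `TransientDBError` and `run_with_retry` are hypothetical names, and a real script would map its driver's own errors (refused connection, read-only server) onto whatever it chooses to retry.

```python
import time


class TransientDBError(Exception):
    """Hypothetical stand-in for a driver error that is worth retrying."""


def run_with_retry(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TransientDBError wait and retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientDBError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_insert():
        # Simulates two failed writes during a failover, then success.
        calls["count"] += 1
        if calls["count"] < 3:
            raise TransientDBError("server is running with --read-only")
        return "row written"

    print(run_with_retry(flaky_insert, sleep=lambda seconds: None))
```

Keeping the number of attempts bounded preserves the failsafe behaviour described in the conversation: a persistent malfunction still stops the script, while a read-only window of a few seconds does not.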
[23:33:29] disable tls for now, until we give users certificates, you will not be able to use it [23:33:37] I do, I have enabled TLS [23:33:50] but your client should not try using tls [23:33:55] or it will fail [23:34:15] it says Tls.TlsException [23:34:18] do not use tls [23:34:38] Just getting a timeout. [23:34:44] No error. [23:34:58] MySQL workbench however is returning SSL connection error: unknown error number [23:35:16] again, do not use tls [23:35:20] jynus: the tls issues are from leloiandudu and probably not database related (although unsure) [23:35:58] jynus: And how do I disable it? [23:36:00] the new packages support tls, but clients should not try to use it [23:37:01] you enabled tls, disable it [23:37:49] 06Labs, 10Tool-Labs-tools-Other, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 3 others: Create an authoritative and well promoted catalog of Wikimedia tools - https://phabricator.wikimedia.org/T115650#3031502 (10Tgr) Quite a few people voted on this task so surely they wanted something more than... [23:37:53] on the client command line, skip-ssl, which should be the default [23:38:11] on programming, the default should be tls [23:38:18] *non-tls connections [23:38:31] jynus: MySQL workbench. How do I disable it on there. Because I can't connect to my DB now. [23:38:34] bd808, I'm not making any requests on the startup, no [23:38:56] let me google "workbench disable tls" :-) [23:39:19] jynus: my script started up. Looks like the timeout disappeared. [23:39:19] first result: https://dev.mysql.com/doc/workbench/en/images/wb-ssl-wizard-start.png [23:39:29] change if available [23:39:30] bd808, only connecting to the db [23:39:32] for never [23:39:38] or something like that [23:40:18] leloiandudu: oh! maybe your db connection is trying to negotiate TLS too like the issue Cyberpower678 is seeing?
[23:40:45] note we didn't disable non-tls connection [23:41:08] but if you try to negotiate with a weak certificate, your connection will be rejected [23:42:16] bd808, maybe! [23:42:24] jynus: SSL Wizard didn't do anything. [23:42:29] some frameworks may try to be too smart and enable it [23:42:33] not the wizard [23:42:33] It just generated a certificate. [23:42:40] the Use SSL option [23:42:43] disable that [23:43:05] it must be enabled or "if available" [23:43:16] jynus: Oh I clicked the button that was boxed in the screenshot [23:43:28] no, that was misleading [23:43:41] but congratulations on your new certificate ! :-) [23:44:08] that may come in handy soon [23:44:16] leloiandudu, what are you using to connect? [23:44:22] jynus: I deleted it. Still not working. SSL is disabled. [23:44:31] then it is not that [23:44:38] jynus .net (mono) [23:44:46] leloiandudu: looking at https://dev.mysql.com/doc/connector-net/en/connector-net-connection-options.html -- maybe try adding "UseSSL=false" to your connection string? [23:44:58] ^that makes sense [23:45:03] bd808, I'm trying now, thanks! [23:45:24] or "SSL Mode = None" [23:45:38] yea, one of those [23:45:46] it seems it may default to preferred [23:46:19] the docs say it should default to SSL Mode = None, but who knows really [23:46:25] yeah [23:46:50] ah, I know what it is [23:47:00] workbench had some issues with mariadb 10 [23:47:02] jynus: I can't figure how to connect. [23:47:05] can/should the server config change to not advertise TLS? [23:47:07] let me search [23:47:11] bd808, jynus, Zppix: thanks! the correct option was to use 'UseSSL=None' in my db connection string [23:47:12] I need my workbench. :-( [23:47:17] bd808, it does not advertize [23:47:21] I wonder what's wrong with the certificate though [23:47:24] leloiandudu: awesome [23:47:26] I have only enabled it [23:47:35] but it requires strong encryption [23:47:42] do we use a self-signed certificate or something like that,
[23:47:55] and sadly most clients do not support it [23:48:07] it is an internal CI, yes [23:48:28] also it requires openssl [23:48:41] regular mysql client does not support it [23:48:47] jynus: so what do I do to fix this? [23:48:52] leloiandudu: It sounds like there is support in the protocol for opportunistic encryption (which is good), but you have to set things up just right for it to work with our servers [23:49:08] Cyberpower678, there is an option to start workbench with 10 [23:49:14] let me search [23:49:16] * bd808 is filling in the blanks here with guesses [23:49:16] bd808, weird. anyway, thanks again! [23:50:05] leloiandudu: so the change is that we just today migrated traffic over to newer MariaDB servers. That's what was different today from any day before. [23:51:28] bd808: and it breaks everything. XD [23:51:29] actually, workbench complains [23:51:41] but it should work with mariadb 10 with no issues [23:52:05] I used it last week with no issue [23:52:06] I think it does. It connects to enwiki.labsdb with no issues, other than that complaining message. [23:52:11] Cyberpower678: your sarcasm/snark setting is too high. Dial it back please. [23:52:25] Cyberpower678, so what is the problem? [23:52:29] bd808: it was neither. I'm making a joke [23:52:30] if you can connect [23:52:35] what is the issue? [23:52:48] I can connect to enwiki.labsdb, but not tools-db [23:52:54] jokes are funny. [23:53:10] I am not sure tools-db is the right dns [23:53:29] tools.labsdb maybe? [23:53:40] check docs [23:53:42] Your connection attempt failed for user 's51059' from your host to server at tools-db:3306: [23:53:42] SSL connection error: unknown error number [23:53:51] ok [23:53:54] bd808: indeed hence the "XD" I attached to it. [23:53:59] then you are tring to use tls [23:54:03] you have to disable that [23:54:10] Yes. 
[23:54:41] SSL is set to "No" [23:54:56] yes, but the error is "SSL connection error" [23:55:02] something is bad there [23:55:59] The settings are identical to the enwiki DB connection settings except for the host. [23:56:03] bd808: one last question. in the maintanence notice they mention postgres. is it stable and available for anybody? can I switch my tool to it? I didn't know we had postgres last time I checked [23:57:09] bd808: I suppose German humor has more sarcasm in it. Germans tend to be quite sarcastic. I'm no exception. :p [23:58:13] jynus: I can't figure it out. I know it can't connect since the update to the DB was done today. [23:58:23] I was able to connect this morning. [23:58:23] leloiandudu: The postgress server is mostly for special projects that need GIS support I think. I don't actually know a lot about it. [23:58:45] bd808, ok, thanks! [23:59:10] I can connnect if I disable SSL [23:59:42] sorry, was away. [23:59:52] Well I can't.