[00:22:25] hey ebernhardson. can you give me some pointers to where I can get some information about the search traffic we have, and any other statistics related to search? (I'm following up on the conversation we had on Thursday and have to learn a bit more before I can proceed). Any other pointers to documentation on how the search is currently done would be great, too.
[00:24:29] leila: hmm, i have some graphs lemme find em
[00:25:21] ebernhardson: if you tell me where the data lives, I can also look into it a bit more myself (the pointers to those above are still useful.)
[00:27:16] leila: well, when i have a question mostly i look at the unsampled search logs in hive. They aren't really the easiest thing to decrypt though and are fairly raw. They live in wmf_raw.cirrussearchrequestset. There is some processed data at discovery.query_clicks_daily that aggregates together webrequests and search logs to represent all main-namespace fulltext search click throughs.
[00:27:47] for a general idea of what's going on though, https://grafana.wikimedia.org/dashboard/db/elasticsearch?orgId=1&panelId=26&fullscreen is probably reasonable and is a query per second graph for a few kinds of queries we serve going back a week.
[00:28:28] There is also a class of query not represented there though, mobile web uses search for related pages recommendations, but those are cached for 24h in varnish so they don't show up on this graph
[00:28:37] ebernhardson: do you have any code I can reuse for webrequest logs, just so I don't reinvent the whole query. :)
[00:28:37] also looking, that might be queries per minute, double checking
[00:29:25] lemme see if anything neat in my hive history
[00:30:20] ok, thanks.
[00:31:48] :S hive history cuts off at 500 lines. Here is an example against the clicks log for counting distinct queries for a few different repeated queries thresholds: https://phabricator.wikimedia.org/P5357
[00:33:58] thanks, ebernhardson.
let me look at these and familiarize myself a bit more. I will need a meeting with you at some point to internalize things, I am sure, but I'll front-load some work on my end before that. ;)
[00:34:00] here is one i found in my phab paste history that figures out what kind of radii are being used in geo-search: https://phabricator.wikimedia.org/P4165
[00:34:41] leila: feel free to ask questions, these logs are really quite cryptic. They kinda represent more of what was easy to extract from cirrussearch rather than some well-planned easy to understand dataset
[00:37:29] sure, ebernhardson. first thing: do you have a dictionary for what each field in cirrussearchrequestset is? Most of them are pretty self-explanatory I should confess.
[00:40:04] leila: best i have is the schema: https://github.com/wikimedia/mediawiki-event-schemas/blob/master/avro/mediawiki/CirrusSearchRequestSet/CirrusSearchRequestSet.idl (i wonder how these didn't make it through schema generation into the hive `desc` command ...)
[00:40:25] this helps, ebernhardson.
[00:41:34] there is also some extra data in payload that didn't make it into the schema, because schema upgrades are a pain in the arse.
[00:41:42] payload is just a map
[00:43:00] got it.
[00:43:13] * leila dives in the schema.
[00:43:17] looks like it generally contains: host (machine serving request), queryString (original query issued by user), acceptLang (languages the user's browser claims to speak), syntax (CSV of special syntaxes used in query)
[00:46:13] i suppose i'll add up front, one of the most confusing parts of that schema is the 'requests' key, which logs each individual request between cirrus and elasticsearch, of which there may be multiple depending on what kind of search the user did. It may be ignorable for the most part, but has a bunch of useful info at times
[00:47:54] right, I was going to say: I'm not sure if I fully grok queryType at the moment.
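Editor's sketch: the description above (a `payload` map plus a `requests` list holding one entry per request between cirrus and elasticsearch) can be modelled in a few lines of Python. This is only an illustration under stated assumptions: the rows are invented, the helper names are hypothetical, placing `queryType` on each `requests` entry is inferred from the conversation rather than taken from the schema, and `more_like` is a placeholder label.

```python
from collections import Counter

# Invented sample rows, shaped loosely after the chat's description.
# Field placement (queryType inside each requests entry) is an assumption.
sample_rows = [
    {
        "payload": {"host": "host-a", "queryString": "apple", "acceptLang": "en", "syntax": ""},
        "requests": [{"queryType": "full_text"}],
    },
    {
        "payload": {"host": "host-b", "queryString": "pear", "acceptLang": "de", "syntax": ""},
        # One user search can produce several backend requests.
        "requests": [{"queryType": "full_text"}, {"queryType": "full_text"}],
    },
    {
        "payload": {"host": "host-a", "queryString": "", "acceptLang": "en", "syntax": ""},
        "requests": [{"queryType": "more_like"}],  # placeholder label
    },
]


def count_request_types(rows):
    """Count individual backend requests per queryType."""
    counts = Counter()
    for row in rows:
        for request in row.get("requests", []):
            counts[request.get("queryType", "unknown")] += 1
    return counts


def rows_with_type(rows, query_type):
    """Return the rows (user-level searches) with at least one request of query_type."""
    return [
        row
        for row in rows
        if any(r.get("queryType") == query_type for r in row.get("requests", []))
    ]


request_counts = count_request_types(sample_rows)
full_text_rows = rows_with_type(sample_rows, "full_text")
```

Counting entries in `requests` and counting rows give different numbers here, which is why the 'requests' key matters when estimating how many searches users actually made.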
[00:48:48] users have many request types, typically you are interested in 'full_text' queryType
[00:49:18] ebernhardson: so when the user searches for a query, which namespaces are searched? Only the one the user is searching in?
[00:49:43] leila: depends on the wiki, it defaults to $wgContentNamespaces. Often this is just 0, the main namespace, but some wikis have different settings
[00:50:29] I see.
[00:53:43] one thing that might be useful, i have a UDF for extracting the "main" request from the requests key. it's not merged yet, but can be compiled and shipped over as necessary: https://gerrit.wikimedia.org/r/#/c/327855/ You can also use the hdfs://analytics-hadoop/user/ebernhardson/refinery-hive-0.0.39-SNAPSHOT.jar which has that
[00:54:13] great. copying it and will look into it.
[00:56:02] i gotta run, but email or poke me later if you have any questions
[00:57:41] so, ebernhardson, is it fair to say that we have on average around 3500 queries per second? (these are intentional queries I guess, as you call them full text, the other ones are triggered by morelike which is a different nature of search)
[01:22:00] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:31:58] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:39:02] Labs-project-Wikistats: Add Assamese Wikisource to Wikistats - https://phabricator.wikimedia.org/T164240#3226840 (Dcljr)
[02:52:57] PROBLEM - Puppet errors on tools-exec-1434 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[03:27:58] RECOVERY - Puppet errors on tools-exec-1434 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:43:07] !log striker Setup striker-puppet01.striker.eqiad.wmflabs as project puppetmaster
[03:43:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Striker/SAL
[05:01:03] leila: well, the problem with the graph that shows 3500 qps peak is
it's showing it from the elastic side. We shard enwiki into 7 pieces, so that counts as 7 queries in the 3500qps. I think when i've looked before it's something like 100 qps full text web, 500 qps full text including api (mobile apps, bots, etc). Autocomplete peaks at a little more than 1k qps, that is slightly buffered by varnish which keeps a 3 hour cache of autocomplet
[05:02:35] to get a real count on autocomplete i would have to work up something against the webrequest table to see what varnish caching does there exactly in terms of hit rate. The final major piece is the more like queries, i haven't personally looked but dca.usse had looked into the webrequest logs before and turned up ~200M requests per day (again buffered by varnish). I'm not sure what the min/max in terms of qps is
[05:15:21] Hello, Amitie_10g here, any sysop online?
[05:15:58] I see ebernhardson. I'll look into this in the morning with a fresher mind then.
[05:17:47] webservice generic is not starting
[05:18:32] Amitie_10g: which tool?
[05:18:41] webarchivebot
[05:19:15] error.log outputs stack trace from Python
[05:21:39] !log tools.webarchivebot Deleted $HOME/service.manifest; webservice restart loop gone crazy
[05:21:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.webarchivebot/SAL
[05:22:18] Amitie_10g: can you try starting it again manually?
[05:22:39] I'll try
[05:23:44] It started
[05:23:58] Who deleted service.manifest?
[05:24:12] I did
[05:24:17] OK
[05:24:25] Thanks
[05:24:53] I'll file a bug with what I saw. it was certainly a bug in the webservice command.
[05:25:14] it may be fixed by a pending patch I have to make `webservice stop` work a bit better
[05:25:31] I noticed that
[05:26:09] Is it a good idea to wait a while before re-starting the webservice?
[05:26:38] Original exception was:
[05:26:38] Traceback (most recent call last):
[05:26:38] File "/usr/bin/webservice-runner", line 27, in <module>
[05:26:39] webservice.run(port)
[05:26:39] File "/usr/lib/python2.7/dist-packages/toollabs/webservice/services/genericwebservice.py", line 18, in run
[05:26:40] os.execv('/bin/sh', ['/bin/sh', '-c', self.extra_args])
[05:26:42] TypeError: execv() arg 2 must contain only strings
[05:26:44] Traceback (most recent call last):
[05:26:46] File "/usr/bin/webservice-runner", line 27, in <module>
[05:26:48] webservice.run(port)
[05:26:50] File "/usr/lib/python2.7/dist-packages/toollabs/webservice/services/genericwebservice.py", line 18, in run
[05:26:55] os.execv('/bin/sh', ['/bin/sh', '-c', self.extra_args])
[05:26:57] TypeError: execv() arg 2 must contain only strings
[05:27:22] Sigyn: it was just a bad paste, not actually spam
[05:27:53] clients without flood protection are not very nice to folks :/
[05:32:55] Labs, Tool-Labs: Webservice stuck in failed restart loop because of corrupt service.manifest - https://phabricator.wikimedia.org/T164245#3226990 (bd808)
[05:38:14] Labs, Tool-Labs: Webservice stuck in failed restart loop because of corrupt service.manifest - https://phabricator.wikimedia.org/T164245#3227007 (bd808) I'm not sure what happened before @Amitie_10g came on irc to ask for help. This may be a case of `webservice stop` not cleaning up the service.manifest...
[07:09:09] (PS1) Giuseppe Lavagetto: Add snakeoil private cert for etcd [labs/private] - https://gerrit.wikimedia.org/r/351252
[07:09:28] (CR) Giuseppe Lavagetto: [V: 2 C: 2] Add snakeoil private cert for etcd [labs/private] - https://gerrit.wikimedia.org/r/351252 (owner: Giuseppe Lavagetto)
[07:44:43] <@bd808> Sigyn: it was just a bad paste, not actually spam
[07:44:47] Sigyn is a robot
[07:45:16] bd808: can't be avoided unfortunately
[08:49:22] Hello! Is there a tools labs admin here?
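Editor's sketch: the TypeError in the traceback pasted at 05:26 is what `os.execv` raises when its argument list holds something that is not a string. The log does not say what `self.extra_args` actually contained, so the following only assumes it was a list or `None`. The helper name `build_sh_argv` is hypothetical and is not part of the webservice code; it shows one way to normalise such a value before handing it to `/bin/sh -c`.

```python
import shlex


def build_sh_argv(extra_args):
    """Return an argv list for `/bin/sh -c` in which every item is a string."""
    if extra_args is None:
        # Nothing to run; fail early with a clear message.
        raise ValueError("no command given")
    if isinstance(extra_args, str):
        command = extra_args
    else:
        # Quote each argument so spaces survive the trip through the shell.
        command = " ".join(shlex.quote(str(arg)) for arg in extra_args)
    return ["/bin/sh", "-c", command]


argv = build_sh_argv(["echo", "hi there"])
# os.execv("/bin/sh", argv) would now receive only strings.
```

Raising early on a missing command gives a readable error before the process image is replaced, instead of a TypeError from inside `execv`.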
[08:49:57] It seems that some webservice instances do not have the right to use jsub
[08:57:08] I don't suppose they are kubernetes based webservices? If so then they can't interact with the gridengine: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web/Kubernetes
[09:00:00] No, it's services running on the old grid
[09:00:51] see https://phabricator.wikimedia.org/T117906
[09:01:57] tools-webgrid-lighttpd-1428, tools-webgrid-lighttpd-1427 and tools-webgrid-lighttpd-1417 seem affected
[09:10:06] ah sorry, it was just a thought :)
[09:13:14] thank you :-)
[09:27:24] hi everybody -- I'm having some trouble connecting to tool labs via ssh, which I never had before. Can anyone help? :)
[09:32:26] sdesabbata: tool labs?
[09:34:26] yep, login.tools.wmflabs.org
[09:41:19] zhuyifei1999_ I don't quite understand as it returns no specific error
[09:42:00] have you tried verbose?
[09:42:11] like ssh -vvv login.tools.wmflabs.org
[09:43:15] yes, at one point it says "receive packet: type 51" "Authentications that can continue: publickey,hostbased", and then goes on trying other auth methods
[09:43:39] hmm, let me see my debug
[09:43:56] zhuyifei1999_: thanks
[09:45:19] yep 51 is normal
[09:45:41] what happens after "Offering RSA public key"?
[09:45:59] and "we sent a publickey packet, wait for reply"
[09:49:35] zhuyifei1999_: "Offering RSA public key: /home/*me*/.ssh/id_rsa"->"send_pubkey_test"->"send packet: type 50"->"we sent a publickey packet, wait for reply"->"receive packet: type 51"->"Authentications that can continue: publickey,hostbased"->"Trying private key: /home/*me*/.ssh/id_dsa"
[09:51:36] "receive packet: type 51" after "we sent a publickey packet, wait for reply" is not good
[09:51:57] supposed to be type 60
[09:52:24] do you see a permission denied (public key)?
[09:54:33] at the very end, "Permission denied (publickey,hostbased)."
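Editor's sketch: the packet type numbers quoted above are SSH user-authentication message numbers from RFC 4252 (50 request, 51 failure, 52 success, 60 public key accepted). The Python below pairs up the "send packet" and "receive packet" lines from `ssh -vvv` style output. The function names are hypothetical and the sample lines are the fragments quoted in the chat, not a complete debug log.

```python
import re

# Message numbers defined in RFC 4252.
USERAUTH_MESSAGES = {
    50: "SSH_MSG_USERAUTH_REQUEST",
    51: "SSH_MSG_USERAUTH_FAILURE",
    52: "SSH_MSG_USERAUTH_SUCCESS",
    60: "SSH_MSG_USERAUTH_PK_OK",
}

PACKET_RE = re.compile(r"(send|receive) packet: type (\d+)")


def annotate(debug_lines):
    """Return (direction, number, name) for every packet line found."""
    events = []
    for line in debug_lines:
        match = PACKET_RE.search(line)
        if match:
            number = int(match.group(2))
            events.append((match.group(1), number, USERAUTH_MESSAGES.get(number, "other")))
    return events


def failed_requests(debug_lines):
    """Count auth requests (50) that were answered directly with a failure (51)."""
    events = annotate(debug_lines)
    return sum(
        1
        for (d1, n1, _), (d2, n2, _) in zip(events, events[1:])
        if (d1, n1, d2, n2) == ("send", 50, "receive", 51)
    )


sample = [
    "Offering RSA public key: /home/*me*/.ssh/id_rsa",
    "send packet: type 50",
    "we sent a publickey packet, wait for reply",
    "receive packet: type 51",
    "Authentications that can continue: publickey,hostbased",
]
events = annotate(sample)
failures = failed_requests(sample)
```

A single failure is not alarming by itself, since the client's initial probe is normally answered with type 51 as well. What matters in the exchange above is that the reply to the offered key was 51 rather than 60.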
[09:54:57] yeah that's a ssh key auth fail
[09:55:10] what puzzled me is that the exact same command worked a few months ago
[09:55:56] you still have the same keys? (and the '/home/*me*/.ssh/id_rsa' is the correct key?)
[09:56:14] (or the dsa one)
[09:56:34] I also created a new key and updated it on the openstack preference page
[09:57:09] I wonder whether my ssh permissions have been changed
[09:57:41] your permissions look fine to me
[09:57:46] $ id sdesabbata
[09:57:46] uid=16260(sdesabbata) gid=500(wikidev) groups=50062(project-bastion),50380(project-tools),53284(tools.infogeo),500(wikidev)
[10:00:48] maybe bd808 can help out on these key issues. I'm not very confident in my knowledge of it
[10:03:53] thank you so much for checking this out
[10:10:17] bd808: would you be able to double-check this ssh issue?
[11:55:05] zhuyifei1999_ bd808: no worries it's something in the conf of this computer, as I can access from another one
[12:12:12] !log wikispeech Deploy latest from Git master: c40dbee, 690c928, 0df33d4, c4f194e, 445c3e3 (T151886)
[12:12:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikispeech/SAL
[12:12:17] T151886: Add info link to control panel - https://phabricator.wikimedia.org/T151886
[13:34:56] Labs, Operations, wikitech.wikimedia.org, Patch-For-Review: Update wikitech-static and develop procedures to keep it maintained - https://phabricator.wikimedia.org/T163721#3227787 (Andrew) I ack'ed the alert yesterday and am working on it. It's composer, of course.
[14:21:53] Labs-project-other, Developer-Relations, WikiApiary, User-Tgr: move WikiApiary to Labs - https://phabricator.wikimedia.org/T149874#3227872 (Tgr)
[14:28:34] Labs, MediaWiki-API: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3227895 (MarcoAurelio) Thanks @Legoktm. I am not sure if there's a pywikibot option for the template.py script or others that forces the bot to retry when they trigger the readonly error.
I didn't knew that maxla...
[14:33:04] Labs, Labs-Infrastructure, Operations, Wikimedia-Apache-configuration, and 2 others: wikitech-static sync broken - https://phabricator.wikimedia.org/T101803#3227922 (Andrew) Open>Resolved I upgraded wikitech to 1.28.2 a few days ago and there was some composer/syntax highlighting snafu th...
[14:33:51] Labs, Operations, wikitech.wikimedia.org, Patch-For-Review: Update wikitech-static and develop procedures to keep it maintained - https://phabricator.wikimedia.org/T163721#3227925 (Andrew) The alerting system seems to be working for this. We haven't designated a specific person in charge, but ma...
[14:45:10] Labs, MediaWiki-API: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3225024 (zhuyifei1999) See also {T154011} for the pywikibot-side handling of the error. > apparently jstart does not work with this pywiki script Pywikibot in general support being started with jstart (and some...
[14:46:13] (/me wants to yell at those who chmod their tool dirs 770)
[14:47:12] Tool-Labs-tools-Other, MediaWiki-API: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3227984 (zhuyifei1999)
[14:50:47] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228005 (zhuyifei1999)
[14:53:15] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228024 (MarcoAurelio) @zhuyifei1999 Sorry but I'd like to keep my project closed to my eyes only and the admins if they ever need to see what's going on there. However you can see in the description which...
[15:15:35] Labs, wikitech.wikimedia.org: Move wikitech-static to Chicago - https://phabricator.wikimedia.org/T164271#3228092 (Andrew)
[15:17:05] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228106 (zhuyifei1999) The necessary questions (which I could have investigate myself if the dir were at least 004) before knowing what went wrong with jstart: 1. do you have a custom $PATH, $PYTHONPATH,...
[15:22:00] Python-mwxml question, perhaps halfak can answer?
[15:22:12] o/
[15:22:22] Trying to read in the enwiki-20170420-logging.xml file, but getting malformed XML error.
[15:23:04] Can python-mwxml handle logging dumps or just revision dumps?
[15:23:11] If it can't handle logging dumps, what Python library can?
[15:25:26] It's designed to handle just the revision dumps. but let me look at the logging dumps.
[15:25:33] Might be able to make support for them easily.
[15:26:20] Can you link me to a logging dump?
[15:26:34] nevermind! Found one.
[15:26:47] halfak: K
[15:27:08] Ahh. Yeah, i see this has "logitem" in it. I wonder if that is in the schema.
[15:27:59] * halfak looks at https://www.mediawiki.org/xml/export-0.10.xsd
[15:28:04] Looks like it is in there.
[15:28:04] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#3228126 (chasemp)
[15:28:07] Labs, Labs-Infrastructure, artificial-intelligence: Provide large disk space to WikiBrain for memory-mapped file - https://phabricator.wikimedia.org/T161554#3228125 (chasemp) Open>stalled
[15:28:50] (CR) Niedzielski: [V: 2 C: 2] Ensure that the Jenkins build timestamp is interpreted as UTC [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/351200 (owner: Mholloway)
[15:29:10] Halfak: Yeah, it is.
[15:29:27] Halfak: How easy is it to implement in mwxml?
[15:29:45] Halfak: Support for the logging dumps I mean
[15:30:06] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228129 (jcrespo) You can see here a summary of the lag of s7, which is where eswiki is hosted (it is fully public): https://grafana.wikimedia.org/dashboard/db/mysql-replication-lag?panelId=7&fullscreen&or...
[15:30:07] It should work OK. I'll need to do some work to the library. OK if I have something for you to test in a few hours?
[15:30:12] codeofdusk, ^
[15:31:01] Halfak: Yep that's fine. What's your preferred way of contacting me? Email? Wiki talk message?
[15:31:16] Are you codeofdusk on-wiki?
[15:31:24] (otherwise, I'm in this channel all day)
[15:31:33] Halfak: Yep.
[15:31:54] OK cool.
[15:32:05] Halfak: On Wikitech, enwiki, enwikt and eswikt.
[15:32:24] Halfak: Any wiki is fine.
[15:32:34] Halfak: Thanks in advance for your changes :)
[15:48:50] Labs, Tool-Labs: Suggesting addition to tool labs documentation regarding tools that delibrately chmod their tool home dirs o-rx - https://phabricator.wikimedia.org/T164272#3228151 (zhuyifei1999)
[15:49:47] Labs, Tool-Labs: Suggesting addition to tool labs documentation regarding tools that delibrately chmod their tool home dirs o-rx - https://phabricator.wikimedia.org/T164272#3228167 (zhuyifei1999)
[15:53:27] Labs, Labs-Infrastructure: Try to make a Stretch base image for Labs - https://phabricator.wikimedia.org/T164273#3228171 (Andrew)
[15:59:06] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228192 (MarcoAurelio) The task is already finished with the template replaced so I'm done here. Thank you to everyone who have helped. Maybe T154011 is the solution for this, I don't know. However and in...
[16:13:28] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228236 (zhuyifei1999) https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Developing#Using_the_shared_Pywikibot_files_.28recommended_setup.29 is a $PYTHONPATH based setup. AFAIK the griddoes not import env...
[16:17:33] Labs, Tool-Labs, Documentation: Suggesting addition to tool labs documentation regarding tools that delibrately chmod their tool home dirs o-rx - https://phabricator.wikimedia.org/T164272#3228253 (bd808) p:Triage>Normal This sounds like a good topic for the #tool-labs-standards-committee to l...
[16:23:04] Tool-Labs-tools-Other: svgtranslate tool-Not Working - https://phabricator.wikimedia.org/T164275#3228265 (Mikey641)
[16:35:04] Labs, Tool-Labs: jsub/jstart inconsistency: non-continuous jobs spawns a login bash shell that loads .bash_profile, but continuous jobs doesn't load either .bash_profile or .bashrc - https://phabricator.wikimedia.org/T164277#3228347 (zhuyifei1999)
[16:37:56] Labs, Tool-Labs, Documentation: Suggesting addition to tool labs documentation regarding tools that delibrately chmod their tool home dirs o-rx - https://phabricator.wikimedia.org/T164272#3228379 (MarcoAurelio) I think users should be free to decide what to do with their directories as long as they d...
[16:41:09] (CR) Mholloway: "Thanks for the +2! I've git pulled and re-confirmed the fix." [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/351200 (owner: Mholloway)
[16:54:06] Labs, Tool-Labs: jsub/jstart inconsistency: non-continuous jobs spawns a login bash shell that loads .bash_profile, but continuous jobs doesn't load either .bash_profile or .bashrc - https://phabricator.wikimedia.org/T164277#3228408 (zhuyifei1999) ``` tools.zhuyifei1999-test@tools-bastion-02:~$ cat > ps-...
[16:56:54] * halfak gets to look at XML over his lunch break :)
[16:58:04] halfak: welcome to 2001!
[16:58:18] :D
[16:58:37] It's not so bad now that I've gotten good at hiding XML behind nice interfaces :D
[16:59:16] * bd808 still has SOAP-RPC flashbacks on occasion
[16:59:25] * halfak barfs
[16:59:28] forgot that existed
[17:00:41] the various "oh I know, let's use SOAP" sub-protocols were pretty gross. Some of them like SAML live on today.
[17:03:17] Labs, Tool-Labs: jsub/jstart inconsistency: non-continuous jobs spawns a login bash shell that loads .bash_profile, but continuous jobs doesn't load either .bash_profile or .bashrc - https://phabricator.wikimedia.org/T164277#3228454 (zhuyifei1999) ``` tools.zhuyifei1999-test@tools-bastion-02:~$ cat > env...
[17:12:55] Tool-Labs-tools-Other, DBA: Tired of APIError: readonly - https://phabricator.wikimedia.org/T164191#3228499 (zhuyifei1999) @MarcoAurelio while T164277 is not fixed, a workaround is to set the environment within the script you are running (in this case `/mnt/nfs/labstore-secondary-tools-project/mabot/.pyw...
[17:18:14] wikibugs: fix wikibugs___
[17:41:11] Tool-Labs-tools-Pageviews: Change "last year" to actual named year - https://phabricator.wikimedia.org/T164284#3228600 (Bluerasberry)
[17:55:15] Labs, Tool-Labs: jsub/jstart inconsistency: non-continuous jobs spawns a login bash shell that loads .bash_profile, but continuous jobs doesn't load either .bash_profile or .bashrc - https://phabricator.wikimedia.org/T164277#3228660 (zhuyifei1999) `qsub(1)` contains this: ``` -noshell...
[17:56:57] bd808: do you know if jstart should load a login shell or not? loading one kind of makes more sense to me
[18:45:34] Labs, wikitech.wikimedia.org: Set up external DNS record for wikitech-static - https://phabricator.wikimedia.org/T164290#3228802 (Andrew)
[18:49:14] zhuyifei1999_: jstart is just a shortcut for `jsub -once -continuous`. Generally I don't think that jsub should always (or even optionally) invoke a login shell.
The user should write their job to do that if it is needed for some reason.
[18:50:09] it's easiest for me to think of jsub/qsub the same way as a cron task. It's up to me to make sure everything in the exec environment is set up how I want it
[18:50:41] login shells typically do a lot of stuff that is not needed/wanted in a quick job
[18:50:52] Good evening. How can I see which ssh keys are active for me on wikitech?
[18:50:57] yeah. the thing is that qsub uses a login shell to parse the command line args afaict
[18:51:05] I know how to add one, but how to list the current ones? :P
[18:51:23] multichill: they should be in https://wikitech.wikimedia.org/wiki/Special:Preferences#mw-prefsection-openstack
[18:51:41] Right :-)
[18:52:18] also on https://toolsadmin.wikimedia.org/profile/settings/ssh-keys
[18:52:45] and in a pinch you can query them from LDAP if you are already in a Labs instance
[18:53:38] you can do that with `ldapsearch -xLLL uid=multichill`
[18:54:30] Great. Looks like I need to do a bit of housekeeping here
[18:55:11] the UI in toolsadmin will let you add more exotic keys than wikitech will
[18:56:17] it supports validating ed25519 and ecdsa keys
[18:57:02] zhuyifei1999_: "qsub uses a login shell" -- if you invoke qsub from a login shell and don't quote the args, then yes
[18:57:24] are you really wanting bigbrother to use your login shell env when restarting something?
[18:57:37] see https://phabricator.wikimedia.org/T164277
[18:58:30] bd808: Looking at the bastion sshd config because I'm considering a similar setup somewhere else and I noticed "PermitRootLogin yes" why would you have that on a bastion host?
[18:58:52] zhuyifei1999_: ah. ok so it's `jsub -continuous` that is a bit goofy
[18:59:06] both the same
[18:59:40] neither jsub -continuous nor jstart invokes a login shell
[18:59:42] multichill: because us roots sometimes ssh directly to the bastions?
[18:59:51] but jsub invokes one
[19:00:08] zhuyifei1999_: *nod*
[19:00:34] I really dislike the -continuous magic that coren invented :/
[19:01:18] that arg parsing makes stuff like `jsub 'do a && do b'` possible
[19:01:56] I dislike ^ as well :P
[19:02:00] so does an intermediate shell script which is honestly a more sane construct for a continuous job
[19:02:42] yeah, but the intermediate shell script is not bash -l
[19:03:00] so non-login non-interactive shell
[19:03:03] so source your .profile?
[19:03:11] (no bashrc)
[19:03:26] hmm
[19:03:56] it's .bash_profile from what I read from the man pages
[19:04:08] cron would behave the same way. expecting the interactive shell environment in a job seems goofy
[19:04:09] the test for .profile didn't run
[19:05:04] bash will run a variety of scripts when invoked as a login shell starting with /etc/profile I think. From there it gets a bit system/account specific
[19:05:39] .bash_profile is a common bash specific shell startup script, yes
[19:07:01] multichill: a longer explanation is that the LDAP ssh magic can get messed up and when it does we can still login using root keys that are provisioned directly on the hosts via puppet.
[19:07:43] we don't have consoles (or at least easy to get to consoles) for the VMs to allow for rescue when ssh is broken
[19:08:06] Last resort is quite useful to have
[19:09:06] I've never thought PermitRootLogin was bad as long as password auth is disabled
[19:12:51] zhuyifei1999_: Do you have time to poke around in the qsub docs and see if there is a way for us to force the normal login shell behavior when sending in a script via stdin? If there isn't I suppose we can modify the injected wrapper script to source the user's environment scripts
[19:13:14] * bd808 still doesn't like -continuous
[19:13:40] you mean a login non-interactive shell?
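Editor's sketch: the cron analogy above implies that a job should set up its own environment instead of counting on .bash_profile having been read. A minimal Python illustration of that idea follows. The helper name `run_with_explicit_env` and the `/tmp/example` value are hypothetical, and nothing here is specific to jsub or the grid.

```python
import os
import subprocess
import sys


def run_with_explicit_env(argv, extra_env):
    """Run argv with a small, explicitly built environment.

    Only PATH is carried over from the caller. Everything else must be
    passed in, so the job behaves the same under cron, a grid job, or an
    interactive shell.
    """
    env = {"PATH": os.environ.get("PATH", os.defpath)}
    env.update(extra_env)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# The child prints the PYTHONPATH it actually received.
result = run_with_explicit_env(
    [sys.executable, "-c", "import os; print(os.environ.get('PYTHONPATH'))"],
    {"PYTHONPATH": "/tmp/example"},
)
seen_pythonpath = result.stdout.strip()
```

Declaring the variables a job needs next to the job itself means it no longer matters whether the scheduler starts a login shell, a plain shell, or no shell at all.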
[19:14:25] zhuyifei1999_: the opposite of -noshell, whatever that happens to be
[19:14:50] -noshell is for non-stdin ones
[19:15:10] I mean command line args
[19:15:30] it probably does an exec directly with the args
[19:16:40] this is what happens with -continuous -- https://github.com/wikimedia/labs-toollabs/blob/master/jobutils/bin/jsub#L638-L652
[19:16:58] yeah ik
[19:17:03] it's gross
[19:17:07] (was reading the code)
[19:17:38] Labs, Beta-Cluster-Infrastructure, media-storage: Rebalance deployment-ms-be01 and deployment-ms-be02 so they run on different labvirt - https://phabricator.wikimedia.org/T161083#3229015 (hashar) T162247 is creating new instances deployment-ms-be03 and deployment-ms-be04. Though I have no way to find...
[19:18:03] also https://sourceforge.net/p/gridscheduler/code/HEAD/tree/trunk/source/daemons/shepherd/builtin_starter.c#l1586
[19:18:27] oops idk why I pointed line 1586
[19:18:45] Labs, Beta-Cluster-Infrastructure, media-storage: Confirm deployment-ms-be03 and deployment-ms-be04 so they run on different labvirt - https://phabricator.wikimedia.org/T161083#3229029 (hashar)
[19:19:31] we have binary = true so it's more likely https://sourceforge.net/p/gridscheduler/code/HEAD/tree/trunk/source/daemons/shepherd/builtin_starter.c#l1509
[19:23:32] continuous should not be setting the binary flag automatically
[19:23:52] that's in the if/else at line 572 in jsub
[19:24:16] Labs, Labs-Infrastructure: Investigate instances with high "steal" CPU - https://phabricator.wikimedia.org/T161118#3229055 (hashar)
[19:24:42] so we are doing a non-binary invocation with the script to exec passed via stdin
[20:15:31] (PS1) BryanDavis: Fix URL for javascript notification api check [labs/striker] - https://gerrit.wikimedia.org/r/351381
[20:19:05] Labs, Operations, Patch-For-Review: rebuild tools-grid-master as a large instance - https://phabricator.wikimedia.org/T162955#3229197 (madhuvishy) a:madhuvishy
[20:26:54] (CR)
BryanDavis: [C: 2] Fix URL for javascript notification api check [labs/striker] - https://gerrit.wikimedia.org/r/351381 (owner: BryanDavis)
[20:29:05] (Merged) jenkins-bot: Fix URL for javascript notification api check [labs/striker] - https://gerrit.wikimedia.org/r/351381 (owner: BryanDavis)
[20:31:08] (PS1) BryanDavis: Fix URL for javascript notification api check [labs/striker/staticfiles] - https://gerrit.wikimedia.org/r/351387
[20:32:20] (CR) BryanDavis: [C: 2] Fix URL for javascript notification api check [labs/striker/staticfiles] - https://gerrit.wikimedia.org/r/351387 (owner: BryanDavis)
[20:32:26] (Merged) jenkins-bot: Fix URL for javascript notification api check [labs/striker/staticfiles] - https://gerrit.wikimedia.org/r/351387 (owner: BryanDavis)
[20:34:21] (PS3) BryanDavis: Implement Tool Labs membership application and processing [labs/striker/deploy] - https://gerrit.wikimedia.org/r/351175 (https://phabricator.wikimedia.org/T162508)
[20:36:43] Labs, Operations, wikitech.wikimedia.org, Patch-For-Review: Update wikitech-static and develop procedures to keep it maintained - https://phabricator.wikimedia.org/T163721#3229240 (Bawolff)
[20:37:04] Labs, Security: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3229241 (dpatrick) No qualms from us. Please proceed at your convenience.
[20:53:17] Tool-Labs-tools-Pageviews: Change "last year" to actual named year - https://phabricator.wikimedia.org/T164284#3229295 (MusikAnimal) Open>Invalid @Bluerasberry The range options are intentionally relative, so you can always link to the past year, past week, or `latest-20` (most recent 20 days), etc....
[21:09:58] PROBLEM - Puppet errors on tools-exec-1438 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[21:20:08] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Labs instance ci-jessie-wikimedia-631424 can not be deleted - https://phabricator.wikimedia.org/T164305#3229375 (hashar)
[21:50:00] RECOVERY - Puppet errors on tools-exec-1438 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:52:22] Labs, Operations, wikitech.wikimedia.org, Patch-For-Review: Update wikitech-static and develop procedures to keep it maintained - https://phabricator.wikimedia.org/T163721#3229493 (Dzahn) >>! In T163721#3227925, @Andrew wrote: > We haven't designated a specific person in charge, but maybe this ca...
[22:05:35] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Optimize edit count queries in XTools - https://phabricator.wikimedia.org/T163284#3229530 (kaldari) It looks like the old XTools only did the expensive revision query for the single requested wiki, but used the `user_editcount` numbers for the top 10 other w...
[22:08:26] are Beta Cluster questions on-topic here?
[22:10:55] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Optimize edit count queries in XTools - https://phabricator.wikimedia.org/T163284#3192230 (jcrespo) revision and revision_userindex are not real tables, they are views of the same table (revision), the discrepancy of edits is just the time between one query...
[22:12:22] hare: typically -releng is the better channel since it's their domain
[22:54:25] Labs, Tool-Labs, Project-Admins: Migrate Tools access request process to Phabricator - https://phabricator.wikimedia.org/T72625#3229850 (bd808) The code needed for {T162508} has been implemented. The workflow can be tested on https://striker.wmflabs.org/ or in a local MediaWiki-Vagrant VM using the `...
[22:55:30] Labs, wikitech.wikimedia.org, Goal: Get rid of SemanticMediaWiki/SRF/SF from wikitech.wikimedia.org - https://phabricator.wikimedia.org/T53642#3229853 (bd808)
[23:30:36] Labs, Tool-Labs: Add HHVM backend for webservice - https://phabricator.wikimedia.org/T164161#3230011 (Amitie_10g) My request at GitHub has been answered: The HHVM developers don't have plans to add support for more than one index file. However,pull requests are welcome.
[23:35:20] Labs, Cassandra, Services (blocked): Request increased quota for services-testbed labs project - https://phabricator.wikimedia.org/T163375#3195307 (bd808) @Eevans My applogies for missing this request until now. Are you still in need of this increase or did you manage to find another place to examine...
[23:37:46] Labs, Cassandra, Services (blocked): Request increased quota for services-testbed labs project - https://phabricator.wikimedia.org/T163375#3195307 (mobrovac) >>! In T163375#3230029, @bd808 wrote: > @Eevans My applogies for missing this request until now. Are you still in need of this increase or did...
[23:44:14] Tool-Labs-tools-Xtools, Community-Tech: If revisions are revdel'd, articleinfo compares the surrounding edits as if it were one edit - https://phabricator.wikimedia.org/T148857#3230057 (DannyH) p:Triage>Normal
[23:44:22] Tool-Labs-tools-Xtools, Community-Tech, I18n: Intuition:X'stools-time ago/ksh i18n issue - https://phabricator.wikimedia.org/T125296#3230058 (kaldari)
[23:50:25] Tool-Labs-tools-Xtools, Community-Tech, I18n: Intuition:X'stools-time ago/ksh i18n issue - https://phabricator.wikimedia.org/T125296#3230083 (kaldari)
[23:50:29] Tool-Labs-tools-Xtools, Community-Tech, I18n: XTools: Cleanup i18n messages - https://phabricator.wikimedia.org/T163468#3198596 (kaldari)
[23:51:29] Tool-Labs-tools-Xtools, Community-Tech, I18n: XTools: Cleanup i18n messages - https://phabricator.wikimedia.org/T163468#3230087 (DannyH) p:Triage>Normal
[23:55:00] !log services-testbed Increased RAM quota to 16384 (T163375)
[23:55:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Services-testbed/SAL
[23:55:03] T163375: Request increased quota for services-testbed labs project - https://phabricator.wikimedia.org/T163375
[23:59:13] Tool-Labs-tools-Xtools, Community-Tech: Update docs to remove reference to running `sql/meta.sql` - https://phabricator.wikimedia.org/T164127#3230102 (kaldari)
[23:59:32] Tool-Labs-tools-Xtools, Community-Tech: Update docs to remove reference to running `sql/meta.sql` - https://phabricator.wikimedia.org/T164127#3222737 (kaldari) p:Triage>Normal
[23:59:49] Tool-Labs-tools-Xtools, Community-Tech: Update docs to remove reference to running sql/meta.sql - https://phabricator.wikimedia.org/T164127#3222737 (kaldari)