[04:00:46] my beard has two colors! I think.
[04:00:54] we could start a whole mutant academy in here
[13:43:11] morning ottomata :)
[14:00:13] morning!
[14:28:40] morning halfak :). How goes?
[14:28:52] Morning Ironholds
[14:29:18] OK. I had to pass off the dog on a local buddy today since I needed to come into GroupLens and Jenny's out of town.
[14:29:31] I think I found a broken stoplight.
[14:29:41] Waited at it for 5 minutes before just driving through the red.
[14:30:01] hah!
[14:30:36] I may have abused my steering wheel and scared the dog while I was waiting.
[14:30:42] aw
[14:30:48] I, uh. Am doing a Hack/Reduce talk, apparently.
[14:30:57] What's that?
[14:32:22] hackerspace in Boston; I'm gonna do two talks, one on my new pending One Distributions Package To Rule Them All and one on the webtools/urltools cluster of libraries
[14:36:15] What is this "One Package" business?
[14:36:29] oh, have I not given you the pitch?
[14:37:17] I think you'll like it! LMK when you have a free 10 minutes and I'll throw up a hangout and pitch it
[14:37:33] it involves user-friendliness, speed, convenience and statistical education
[14:41:37] OK. Should probably wait until tomorrow when taking calls is easier. But I'm looking forward to it. :)
[14:41:50] totally :)
[15:30:41] halfak, ack. What was that function you wanted run over the data I ran the logistic regression on?
[15:30:43] brain fail
[15:31:05] cor()
[15:31:23] ta!
[15:40:49] hey-mo :)
[16:20:18] J-Mo, you should add hey-mo to your list of ping words.
[16:25:24] only if leila adds heyla ;p
[16:28:55] okay hollaver!
[16:29:17] and hellofak?
[16:51:32] J-Mo, hollaver? I like it
[16:53:37] Ironholds: Do you have time for a quick pm?
[16:53:56] sure!
[17:11:58] halfak: 3000+ total queries on quarry!
a
[17:12:04] and we never announced on wiki-research-l :P
[17:12:13] halfak: WMFR’s april fool’s email linked to quarry too :D
[17:16:54] Quarry: Quarry sorts by the first column by default despite and ORDER BY clause - https://phabricator.wikimedia.org/T95369#1195015 (Amire80)
[17:16:58] so I've decided my next Good Article:
[17:17:03] the Dartmouth Conference on Artificial Intelligence
[17:17:05] * Ironholds nods firmly
[17:18:43] Quarry: Quarry sorts by the first column by default despite and ORDER BY clause - https://phabricator.wikimedia.org/T95369#1187983 (Amire80)
[17:18:47] Quarry: Quarry does not respect ORDER BY sort order in result set - https://phabricator.wikimedia.org/T87829#1195024 (Amire80)
[17:22:33] Quarry: it would be useful to run the same Quarry query conveniently in several database - https://phabricator.wikimedia.org/T95582#1195035 (Amire80) NEW
[17:27:06] o/ leila missed you at research group.
[17:27:20] I see that you didn't want to come because the agenda wasn't in place last night.
[17:27:35] How much advance do you think is necessary?
[17:28:56] Hi halfak. Given that the meeting is relatively early in the morning, if there is no agenda set by 6pm the day before (~ EOD in PT) I tend to decline and book my Thursday morning for focus
[17:33:13] OK. I'll keep that in mind next time.
[17:33:40] halfak: usually the problem for me is that unless there are few items in the agenda, the meeting will turn into pointer-like discussions which stay very high level.
[17:56:53] leila|away, not sure what "pointer-like" means or why high-level discussions are something we ought to avoid.
[17:57:41] But FWIW, 11AM isn't very early in the morning for me, so adding things to the agenda at 8AM made sense from a non-SF-focused perspective. But I appreciate your needs so I'll aim to get my items on the agenda 24h in advance from here forward.
[17:58:38] cromulent graphic of the day:
[17:58:39] https://commons.wikimedia.org/wiki/File:Labels_without_sites_point.svg
[17:58:58] this is: for all the labels we have on WD items without sitelinks in the same language, what languages are they labels for?
[17:58:58] What's a label?
[17:59:08] label == statement?
[17:59:13] For https://www.wikidata.org/wiki/Q666, number of the beast is the label
[17:59:15] so, the "title"
[17:59:16] * halfak is not that great at WD terms
[17:59:37] neither is WD
[17:59:51] ha
[18:00:00] but basically, if we want to build placeholder articles in european languages we already have major coverage in, great
[18:00:02] otherwise, ehhhhhhhhh
[18:01:55] Ironholds: one thing i found interesting is that i could only write labels for four or so languages
[18:02:03] english, spanish, chinese, and a fourth one
[18:02:11] yep
[18:02:19] because Wikidata's language selection is...not good.
[18:02:25] like, it's a range based on your language settings
[18:02:33] after that you are literally prohibited from writing labels
[18:02:39] you have to change your site language to do so
[18:02:53] i don't know very many languages but in some cases i know the name of a concept in some other random language and I want to be able to define the label that way
[18:03:44] there was a wikiproject that had a russian version and i wanted to set the russian label to just be that page name. simple copy/paste operation
[18:05:30] guillom, thoughts on https://meta.wikimedia.org/wiki/User:Guillaume_%28WMF%29/Wikidata_all_the_things#RQ1:_Missing_labels.2Fmissing_articles ?
[18:05:50] halfak: I didn't say "we ought to avoid" high level discussions. :-)
[18:08:22] Ironholds: looks great!
[18:08:49] cool!
[18:52:31] leila, I see. Just that you'd rather avoid them?
[18:53:42] Ironholds, I'm looking for the GLM output from that logistic model.
[18:53:52] Did you send it in an email or is there a gist up somewhere?
[18:55:58] the coefficients orr?
[19:01:47] halfak, https://github.com/Ironholds/MobileUserInterfaces/blob/master/Paper/Datasets/revert_glm_modelling.tsv for the high-level summary
[19:01:53] (the full model is something like 6gb)
[19:17:40] Ironholds, this looks OK. There's a strong correlation between type and user_type (which makes sense) but the coefs have the same sign, so it looks OK.
[19:17:48] Can you run a variance inflation factor?
[19:17:49] http://scg.sdsu.edu/logit_r/
[19:17:52] ^ see that
[19:18:09] "VIF"
[19:18:28] halfak, sure!
[19:18:59] so, a VIF against the cor() matrix or against the LM model?
[19:19:22] VIF against the GLM
[19:20:39] Rule of thumb for VIF is 5
[19:20:51] *thumbs up*
[19:20:53] goodbye processor
[19:20:53] If you see a VIF of 5 or higher, you have a problem and you can fix it by removing predictors.
[19:22:02] Your model fitness should not suffer for removing one of two highly correlated predictors.
[19:23:24] *thumbs up*
[19:23:31] I will try to do it tomorrow; WD stuff and a presentation to write :/
[20:31:50] and now I have non-GroupLens UMN people following me around
[20:31:51] whee
[21:39:16] guillom, ~~~ I am doing the research ~~~
[21:39:22] we have all the label/description interplays done
[21:39:26] and now for references and images
[21:55:35] \o/
[22:00:29] leila: room changed for backlog grooming, moved to Elder
[22:01:37] I'm in Elder. where are you, DarTar?
[22:01:39] ;-)
[22:01:54] oh god
[22:01:56] WIKIDATA, I HATE YOU
[22:02:01] guillom, guess how images are stored?
[22:02:10] they're just strings. Just strings! Strings containing the filename!
[22:03:55] halfak, ellery, groomers!
[22:08:53] Ironholds: Alright, I'll bite: How else could they be stored?
[22:09:13] there could be an image_link datatype?
[22:09:31] I would love to see how we're even going to include images without them being identifiable as images
[22:09:37] running a regular expression over each value isn't gonna cut it
[22:10:02] Oh I see.
Sorry, still not very familiar with Wikidata lingo.
[22:10:12] ah, gotcha
[22:10:27] so, example: https://www.wikidata.org/wiki/Q64 "Wikivoyage banner" = "Berlin banner.jpg"
[22:10:32] in the API, "Berlin banner.jpg" is a string
[22:10:43] so is "http://www.berlin.de/" and "Walther Schreiber"
[22:10:52] those are all, minus looking for final endings, the same class of thing
[22:11:22] Ironholds: But we know that "Wikivoyage banner" is a file, right?
[22:11:38] we do! If we build a list of all the properties that could be files
[22:11:40] and maintain that list
[22:11:44] and build it into our selection
[22:11:49] |_|
[22:12:02] because, see
[22:12:05] the property page, https://www.wikidata.org/wiki/Property:P948 ?
[22:12:12] ...that doesn't indicate those are images in a machine-readable wa
[22:12:12] y
[22:12:31] It says"Data type: Commons media file"
[22:15:46] Ironholds: Are you saying the data type isn't machine-readable?
[22:15:57] Sorry if I'm being thick.
[22:16:05] it does?
[22:16:08] not in my API dump :/
[22:16:12] Well, the page does.
[22:16:44] oh! The property does!
[22:16:48] okay, this sounds more viable then. Thanks!
[22:16:52] :)
[22:25:10] * Ironholds grumbles, builds list by hand
[22:25:54] hey halfak. Are you the one who wanted a Phabricator version of Special:Contributions this morning?
[22:26:01] https://phabricator.wikimedia.org/p/halfak/feed/
[22:26:03] Yes :)
[22:26:06] <3
[22:26:15] Turns out there's one :)
[22:26:15] Perfect!
[22:37:15] Deskana, just encountered a great illustration of the need for a WD importance indicator
[22:37:56] an item has multiple properties that consist of images. You want to display something pretty in the [article placeholder/search results]. Pick one! Err. Which one.
*segfault*
[22:39:02] https://www.wikidata.org/wiki/Property_talk:P18
[22:39:23] " if available, use more specific properties (sample: coat of arms image, locator map, flag image, signature image, logo image)"
[22:39:31] is the problem :(
[22:40:57] Ironholds: It's almost like it's not optimised for actually being used!
[22:41:31] That's not a real constraint, just a way to apply the constraint. https://www.wikidata.org/wiki/Property_talk:P18#Database_or_datahell.3F
[22:42:20] Deskana, yep ;p
[22:42:29] Nemo_bis, augh
[22:43:28] What the…
[22:43:56] Only 15k violations now, so it's working https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P18#Single_value
[22:46:05] * Ironholds whimpers
[22:46:12] I'm only three research questions through
[22:46:19] and I want to write a report that consists of the following:
[22:46:30] "Research question: Can Wikidata be used in Engineering products?"
[22:46:33] "Abstract: No"
[22:46:44] "Results: Heeeeeeeeell no."
[22:46:50] "Conclusion: No"
[22:47:07] it's just too brittle and inchoate at the moment
[22:47:33] TIL: inchoate.
[22:47:49] four years as a law student and I get to use "inchoate" and "oblique intent"
[22:47:57] other stuff too but those are the main gains
[22:53:19] Data gets better only when it's used
[22:54:56] Does anyone remember the trick to use so that Labs webservices restart automatically? Like calling webservice through another command iirc. YuviPanda?
[22:58:29] Ah, found it. "bigbrother".
[23:14:28] guillom: starting yesterday you dont need anything - if you do a webservice start or whatever it should be completely self sustaining
[23:14:36] Should be announced today
[23:14:51] YuviPanda: \o/ Awesome!
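
Annotation on the 22:10-22:25 exchange about image properties: the discussion ends with the observation that the property itself declares a data type ("Commons media file"), after which the list of file properties is built by hand. A minimal sketch of deriving that list instead, assuming property records shaped like Wikidata entity JSON with a `datatype` field whose value is `commonsMedia` for Commons files. The sample records are hand-written for illustration (not an API dump), and the helper names `media_property_ids` and `is_media_claim` are hypothetical.

```python
# Hand-written sample records that mimic the shape of Wikidata property JSON.
# Assumption: each property carries a "datatype" field, and Commons files
# use the value "commonsMedia". These records are illustrative only.
SAMPLE_PROPERTIES = {
    "P948": {"id": "P948", "datatype": "commonsMedia"},  # "Wikivoyage banner" in the log
    "P18": {"id": "P18", "datatype": "commonsMedia"},    # image property from the log
    "P856": {"id": "P856", "datatype": "url"},           # a URL-valued property
}


def media_property_ids(properties, media_datatype="commonsMedia"):
    """Return the sorted IDs of properties whose datatype marks a Commons file."""
    return sorted(
        pid
        for pid, record in properties.items()
        if record.get("datatype") == media_datatype
    )


def is_media_claim(property_id, media_ids):
    """Decide whether a claim's value is a file name by its property, not by regex."""
    return property_id in media_ids


media_ids = media_property_ids(SAMPLE_PROPERTIES)
print(media_ids)
print(is_media_claim("P948", media_ids), is_media_claim("P856", media_ids))
```

Keying the decision on the property's declared type avoids running a regular expression over every value, which is the approach rejected at 22:09:37. It does not address the later question (22:37:56) of which image to pick when an item has several.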