[00:38:09] We should break it and error it for now, so that AW users are not encouraged by seeing any output. I've added a test to show how bad the "bandaid" could be (sorry "Denny"): Z33383. I will disconnect both implementations. : https://tools-static.wmflabs.org/bridgebot/54d73387/file_79354.jpg [00:39:39] Done: Z26955 (re @Al: First, it can change its name, because I think we are agreed now that article selection or omission is not decidable in a langua...) [00:42:36] This is part of the reason we shouldn't have AI-generated tools running around and creating a bunch of mass-produced articles. We're not at the stage where mass-production of content is helpful, and perhaps we're at a stage where it's actively harmful (re @wmtelegram_bot: What should Z26955 (SPO sentence, S without and O with article) do? It is now used in many articles created by Abstra...) [00:47:33] IMO using a QID (not a PID) for the verb will eventually greatly expand our flexibility. So I'm fine with dreaming that big. Regarding the article, the user who calls the function will eventually want to specify some kind of grammatical number of both the subject and object (here hardcoded ?singular and singular), and some kind of intended tense (here some kind of present). So I'm going to change the English label and description to reflect these choices. Feel free to revert or discuss. (re @Al: First, it can change its name, because I think we are agreed now that article selection or omission is not decidable in a langua...) [00:48:37] "AWE" is *very* good at creating many articles very fast, but perhaps due to essentialism it is fundamentally incapable of creating innovative articles, and we need innovative quality long articles more. (re @Feeglgeef: This is part of the reason we shouldn't have AI-generated tools running around and creating a bunch of mass-produced articles. W...) 
[00:50:57] Many on enwiki have rightfully pointed out that we have no good articles right now, thus making the project unviable and a waste of time and money. Though I strongly disagree with their conclusion, the premise, that we have no good articles right now, is definitely true, and multiplying the number of poor articles we have won't fix anything. [01:08:36] I've proposed a new item of best practice: [[Wikifunctions_talk:Best_practices#output_inconsistent_with_the_Universal_Code_of_Conduct]] (re @u99of9: We should break it and error it for now, so that AW users are not encouraged by seeing any output. I've added a test to show how...) [01:09:44] Can you link to the discussion? I may not have the will to read it, but it would help to know where if I do. (re @Feeglgeef: Many on enwiki have rightfully pointed out that we have no good articles right now, thus making the project unviable and a waste...) [01:11:10] It's on [[w:Village pump (WMF)]], you should be able to find the section in the TOC (re @u99of9: Can you link to the discussion? I may not have the will to read it, but it would help to know where if I do.) [01:11:53] /delete@wikilinksbot [01:12:28] [[w:WP:Village pump (WMF)]] [01:20:40] I've commented there but I'll add here that the Universal Code of Conduct should already be binding, no? (re @u99of9: I've proposed a new item of best practice: [[Wikifunctions_talk:Best_practices#output_inconsistent_with_the_Universal_Code_of_Co...) [01:23:22] Yes, but in this instance binding on who and what do we do? Was I wrong to make the (valid non-abusive) test? Was the implementation abusive before I made the test? (re @Feeglgeef: I've commented there but I'll add here that the Universal Code of Conduct should already be binding, no?) [01:26:24] Fair. I do agree that we need a policy, but I don't think your suggestion is broad enough. I've had a test deleted for being "unnecessarily political and derogatory" so we already have an informal one. 
It'd be nice to have a formal one and not leave it up to the personal opinion of the administrator (but still broad enough to allow their *judgement*) [01:31:51] What craziness is this? Are we trying to undermine our own credibility??? I fully support the speedy-deletes now listed in [[:abstract:Category:Candidates_for_speedy_deletion]] : https://tools-static.wmflabs.org/bridgebot/d5d18705/file_79355.jpg [01:37:42] I don't believe the articles should be on the wiki, but perhaps you're overreacting? The software should probably stop you from creating an article on a Wikidata item that doesn't exist. (re @u99of9: What craziness is this? Are we trying to undermine our own credibility??? I fully support the speedy-deletes now listed in [[:a...) [01:40:17] Have you considered requesting administrator rights as I sorta nominated you for on the RFUG page? I think you'd be great at it! (re @u99of9: What craziness is this? Are we trying to undermine our own credibility??? I fully support the speedy-deletes now listed in [[:a...) [01:41:51] Thanks. I've been pre-occupied with a couple of talks I was giving over the weekend. But now that they're done, I'm keen to catch up on what's been going on. To be honest I didn't expect that we would need admin action so soon! (re @Feeglgeef: Have you considered requesting administrator rights as I sorta nominated you for on the RFUG page? I think you'd be great at it!) [01:42:29] Oh, I forgot about those! Are they on Commons yet? (re @u99of9: Thanks. I've been pre-occupied with a couple of talks I was giving over the weekend. But now that they're done, I'm keen to catc...) [01:43:15] Yes, some of this could be stopped technically, but I'm more disappointed that as a community we are publishing it in the first place! (re @Feeglgeef: I don't believe the articles should be on the wiki, but perhaps you're overreacting? The software should probably stop you from ...) 
[01:44:42] Not yet, there is a board meeting following the conference, so staff are still very busy. I expect it will be on Youtube within about a week. (re @Feeglgeef: Oh, I forgot about those! Are they on Commons yet?) [01:45:08] Alright! Please let us know when. (re @u99of9: Not yet, there is a board meeting following the conference, so staff are still very busy. I expect it will be on Youtube within ...) [02:26:12] Wikifunctions just reached 4000 functions a few hours ago [02:26:17] The 4000th function is Z33366 [03:30:10] I don't think we have sufficiently well-developed WF language functions to write a good article yet. (Let's work on that.) But I agree that the proliferation of stub articles is also not worth encouraging yet. And at the very least, these stubs should be made from very solidly designed functions that will theoretically work in all languages. I've messed around with the English descriptions of Z26039 and Z26095 to try to clarify that they are not insisting on the use of particular articles in a particular language. Can we even change the labels? [03:39:59] My above post was intended as a reply to this. (re @Feeglgeef: Many on enwiki have rightfully pointed out that we have no good articles right now, thus making the project unviable and a waste...) [08:00:54] I’m all for dreaming big, but codifying the expression of relations specified by every Property type should give us a wide enough range of patterns to follow for those relations that are not conveniently expressed by an available Property type. [08:00:55] I’m not sure grammatical number or tense should be specified through the function call as either arguments or by function selection. This takes us down the path that led to confusion with articles, so I’m pretty sure they should not be. 
Yes, the participants in the relation may require explicit framing with regard to quantity (etc), and the relation itself may require framing with regard to time (etc), but I think we should avoid explicitly linking these to grammatical features (in the abstract content, that is). [08:00:58] In many cases the required framing may be intrinsic to the participants or the relation (dependent on the Property or expressed by qualifiers on Wikidata). The function call should therefore be explicit about whether it relies on the default framing implied by Wikidata content, supplements it, or overrides it. (re @u99of9: IMO using a QID (not a PID) for the verb will eventually greatly expand our flexibility. So I'm fine with dreaming that big. Reg...) [08:06:53] I don't see a problem with QIDs being used instead of PIDs for verbs, as I don't recall a way to connect lexemes to a specific predicate AFAIK. [08:07:55] We do have P9970. [08:09:03] Example use: [[d:Lexeme:L3636#S1-P9970]] [08:09:43] i meant property lol not predicate [08:09:47] Those statements link to QIDs not PIDs. (re @Jan_ainali: We do have P9970.) [08:10:05] P9970 connects to QIDs not PIDs [08:11:48] For PID to lexeme, you have to use P9970 + P1629. [08:12:06] At least AFAIK [08:13:53] P1629? [08:16:32] For PID to QID, because I don't know any property that connects lexemes directly with properties. [08:23:12] I think that’s right. I don’t think there should be one. (But the wikilinksbot doesn’t look up references in bridged content.) (re @wmtelegram_bot: For PID to QID, because I don't know any property that connects lexemes directly with properties.) [09:12:22] Z26039 corresponds to P31 whereas Z26095 corresponds to P279. Logically (?), the first is instantiating and the second is classifying. I’ve boldly relabelled these in English to “subject is instance of (Monolingual text)” and “subject is kind of (Monolingual text)”? 
Feel free to refine these, but let’s not immediately propagate the change to configurations and the language-specific functions. (re @u99of9: I don't think we have sufficiently well-developed WF language functions to write a good article yet. (Let's work on that.) But I...) [09:26:33] This is a good change. Tomorrow I will refine a bit to try to specify that Z26095 aims for "a bird is a dinosaur", and if you prefer "birds are dinosaurs" you should make a different function, then a numerical configurator. (re @Al: Z26039 corresponds to P31 whereas Z26095 corresponds to P279. Logically (?), the first is instantiating and the second is classi...) [09:30:59] At this point I'm happy to mainly argue for some kind of abstract framing with regard to number/time/gender etc. I'd be interested in hearing how you would do this. It's especially worth thinking about how the selectors will show up to the users (enumerated types ideal). (re @Al: I’m all for dreaming big, but codifying the expression of relations specified by every Property type should give us a wide enoug...) [09:54:03] [[Wikifunctions:Type proposals/Semantic unit]] would also solve this problem (eventually through the use of wrapping creation functions) (re @u99of9: At this point I'm happy to mainly argue for some kind of abstract framing with regard to number/time/gender etc. I'd be interest...) [09:58:33] deleted (obviously) (re @u99of9: What craziness is this? Are we trying to undermine our own credibility??? I fully support the speedy-deletes now listed in [[:a...) [10:10:31] I must admit it was cleverly constructed and quite successfully rage-baiting. (re @u99of9: What craziness is this? Are we trying to undermine our own credibility??? I fully support the speedy-deletes now listed in [[:a...) 
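The P9970 + P1629 chain mentioned above can be sketched in a few lines. Since no Wikidata property links lexemes directly to properties, you go property → item (via P1629) and lexeme sense → item (via P9970), then join on the item. All data in this sketch is invented for illustration, not taken from Wikidata:

```python
# Toy sketch of the "PID to lexeme via P9970 + P1629" chain from the chat.
# P1629 links a property to its corresponding item; P9970 links a lexeme
# sense to the item it expresses. The mappings below are HYPOTHETICAL
# placeholders, not real Wikidata claims.

# property -> item corresponding to the property's concept (P1629)
P1629 = {
    "P31": "Q21503252",  # hypothetical example value
}

# lexeme sense -> item the sense expresses (P9970)
P9970 = {
    "L3636-S1": "Q21503252",  # hypothetical example value
}

def lexeme_senses_for_property(pid: str) -> list[str]:
    """Join the two statement sets: senses whose P9970 item
    matches the P1629 item of the given property."""
    target = P1629.get(pid)
    if target is None:
        return []
    return [sense for sense, qid in P9970.items() if qid == target]

print(lexeme_senses_for_property("P31"))  # ['L3636-S1']
```

In real use both mappings would be fetched from Wikidata rather than hardcoded; the point is just that the join goes through an item, never property-to-lexeme directly.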
[10:11:27] also, I see that there was a lot of discussion about lexemes [10:11:28] it is already possible (but indeed not often done) to link (directly or indirectly in some cases) every lexeme to an item (I'm working on Breton and French to link all of them before next Wikimania) [10:11:29] we have P5137, P9970 but also P8471, P6593, P5975, etc. (and not to mention qualifiers and such) [10:17:06] That will be very impressive and useful. I doubt that it will ever be a surjective mapping though, so we will always need labels. (re @NicolasVIGNERON: also, I see that there was a lot of discussion about lexemes; it is already possible (but indeed not often done) to link (directl...) [10:19:36] yeah, obviously not surjective (most items will never have a lexeme; all the scholarly article labels - ~50 % of all items - are not lexemes) [10:19:37] but what is doable is for every lexeme to be linked to an item (re @u99of9: That will be very impressive and useful. I doubt that it will ever be a surjective mapping though, so we will always need labels...) [11:56:37] In the case of number, I’m not sure enumeration is optimal, particularly when it is framed as 0 or 1. To support dual and degrees of paucal and plural, an indicative count may be more usable than a range or classification. The important thing, it seems to me, is that the framing is more a matter of knowledge representation than syntactic categorisation. And for KR, the established patterns in Wikidata should be our invaluable guide even when the relation asserted in AW is absent from Wikidata, for whatever reason. [11:56:40] To cite an old example from Meta discussions (I think), if Judi Dench prefers to call herself an actor (in English), it may indicate a preference for an ungendered noun in other languages where that is an option. But we may choose, editorially, to disregard such a preference, particularly for a language where the gendered noun has fewer negative connotations. 
To me, cases such as this suggest that we frame the relation as an epistemic proposition and then frame the proposition in (for example) a social context and then frame the framed proposition in an editorial context. A framed proposition is still a matter of KR that could, in principle, be represented in Wikidata, but the editorial framing belongs in AW, to the extent that it is not embedded in Wikifunctions. (re @u99of9: At this point I'm happy to mainly argue for some kind of abstract framing with regard to number/time/gender etc. I'd be interest...) [12:26:21] Apologies that I haven't engaged in any of these types of proposal yet. I've started reading and digesting. My initial impression is that they appear to require a whole lot more computation than we typically have available. (re @dvd_ccc27919: [[Wikifunctions:Type proposals/Semantic unit]] would also solve this problem (eventually through the use of wrapping creation fu...) [12:39:21] What would be particularly computationally expensive? (re @u99of9: Apologies that I haven't engaged in any of these types of proposal yet. I've started reading and digesting. My initial impressio...) [12:43:06] Processing all possible forms all the way along a chain of reasoning, rather than cutting away and aiming for the one outcome we want. Even just the complexity of each object means that each step in the chain calls 5-10 helpers. I'm sure I haven't described it well, because at the moment it's just an impression. [12:50:01] Compare that to what we try to do now: simply deciding whether "United States" needs a definite article in English. They're not good or optimised, but look how long the implementations take on this test! Z32649 (re @u99of9: Processing all possible forms all the way along a chain of reasoning, rather than cutting away and aiming for the one outcome we...) 
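The "does 'United States' need a definite article" check discussed above can be caricatured in a few lines to show why it sounds cheap. This is a toy heuristic with a hand-picked exception list, not the actual Wikifunctions implementation behind Z32649; the thread's point is that doing it properly from item data is what gets expensive:

```python
# Toy caricature of "does this English country name take 'the'?".
# NOT the on-wiki implementation: just a hardcoded exception list,
# whereas a real version would consult item data, which is the
# expensive part discussed in the chat.

PLURAL_OR_DESCRIPTIVE = {
    "United States", "United Kingdom", "Netherlands",
    "Philippines", "Czech Republic", "Gambia",
}

def needs_definite_article(name: str) -> bool:
    """True if the English rendering should be 'the <name>'."""
    return name in PLURAL_OR_DESCRIPTIVE

print(needs_definite_article("United States"))  # True
print(needs_definite_article("France"))         # False
```

Even this trivial decision needs language-specific knowledge that cannot be hardcoded at scale, which is why the real test fetches from Wikidata and runs close to the timeout.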
[13:35:22] Yeah… I put a selective fetch in that test, which rather undermines the point you were making, sorry 🫤 (re @u99of9: Compare that to what we try to do now: simply deciding whether "United States" needs a definite article in English. They're not ...) [13:53:11] That's useful to see, but this function is more likely to operate on a fully fetched item, because it will be used to decide other things too. But even with your selective fetch, we can only do the equivalent of this 5 times before timeout. My impression of carrying multiple inflections etc is that it will require many more than 5 longer processes. (re @Al: Yeah… I put a selective fetch in that test, which rather undermines the point you were making, sorry 🫤) [14:13:40] The trick might be to make the fetch as broad as needed at the outset, but still not a full fetch. Inflections are less of an issue because lexemes tend to be much smaller and we can probably avoid statements altogether. But for the article subject, in particular, a dedicated fetch across the core statements and (a few) linked lexemes in a handful of languages seems appropriate. [14:13:40] In theory, each call would reuse the result from the first one, but currently the calls need to be identical to avoid a subsequent fetch, as I understand it. [14:13:41] Reducing the result of such a fetch to a more compact representation may well turn out to be a more theoretical optimisation than a measurable benefit, but it’s certainly worth looking at. (re @u99of9: That's useful to see, but this function is more likely to operate on a fully fetched item, because it will be used to decide oth...) [14:35:52] I don't understand where the "processing all possible forms" would happen. The proposal would only need the computation of the needed outcome (re @u99of9: Processing all possible forms all the way along a chain of reasoning, rather than cutting away and aiming for the one outcome we...) 
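The "calls need to be identical to avoid a subsequent fetch" behaviour described above can be sketched with a memoized function: a result is reused only when a later call matches the earlier one argument-for-argument, so a broader fetch up front pays off and a slightly narrower repeat call does not. The fetch here is simulated with a counter, no real Wikidata access:

```python
# Sketch of identical-call caching as described in the chat: only a
# byte-for-byte identical (qid, fields) call hits the cache; a narrower
# variant of the same fetch triggers a fresh one.
from functools import lru_cache

FETCHES = {"count": 0}  # counts simulated network fetches

@lru_cache(maxsize=None)
def fetch(qid: str, fields: tuple[str, ...]) -> str:
    """Pretend to fetch an item; cached per exact argument tuple."""
    FETCHES["count"] += 1
    return f"{qid}:{','.join(fields)}"

fetch("Q30", ("labels", "claims"))
fetch("Q30", ("labels", "claims"))  # identical call: served from cache
fetch("Q30", ("labels",))           # narrower call: a second real fetch
print(FETCHES["count"])  # 2
```

Sharing results across merely overlapping calls, rather than identical ones, would need the more compact intermediate representation the message above speculates about.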
[15:11:02] that's not meant to be permanent. Composition v2 allows us to do caching for partial function calls, which should help with this issue at some point (re @Feeglgeef: @vrandecic your "tip" in the 3-26 status update would cause rendered content to confuse screen readers. Do you intend this to be...) [15:14:31] T421955 and T421956 (re @Feeglgeef: When I open abstract wikipedia on a mobile device with the Wikipedia app installed, I get thrown into the app and an error. Is t...) [17:03:28] Friendly reminder that this is happening in 30 minutes. See you there! (re @Sannita: Hi all! Our next Volunteers' Corner will be held on Monday, April 13, at 17:30 UTC. If you have questions or ideas to discuss, ...) [18:48:21] Thank you to everyone who came to the Volunteers' Corner, as always the recording will be up in the next few days (I hope by Wednesday, if everything goes alright) [21:10:29] I hope you all had a great time, and maybe there was some interesting feedback? 😎 (re @u99of9: Thanks. I've been pre-occupied with a couple of talks I was giving over the weekend. But now that they're done, I'm keen to catc...) [21:15:51] One question about a risk that I think is valid: "are all the sentences/articles going to come out dry and boring?" (re @Al: I hope you all had a great time, and maybe there was some interesting feedback? 😎) [21:16:15] That's a feature, not a bug. (re @u99of9: One question about a risk that I think is valid: "are all the sentences/articles going to come out dry and boring?") [21:16:35] A few of the other questions were things we have plans for. (re @Al: I hope you all had a great time, and maybe there was some interesting feedback? 😎) [21:17:18] Is it? If the goal is to spread knowledge I don't see why repulsively bland articles are good. (re @Jan_ainali: That's a feature, not a bug.) [21:18:31] That's how I measure a good article on regular Wikipedia, so why not in abstract? (re @Feeglgeef: Is it? 
If the goal is to spread knowledge I don't see why repulsively bland articles are good.) [21:36:46] I guess what we really want is plain language; easy to understand and unambiguous. It just tends to draw towards dry and boring rather than poetic and exciting. [21:37:43] I don't mean we're making a children's book, but it should be encyclopedic-boring, not slop-boring (re @Jan_ainali: I guess what we really want is plain language; easy to understand and unambiguous. It just tends to draw towards dry and boring ...) [21:38:05] Right now we only have slop-boring, not encyclopedic-boring or entertaining [21:45:19] Yes, it’s fair… and it will be interesting to see whether text is considered more interesting in some languages than in others. Fundamentally, it may seem that using the same function with different participants in the relation will result in similar sentences, through determinism, but relations that are more richly framed can lead to linguistic variation. I think you can see signs of that in Mahir’s proposal. In any event, once you have reliable, encyclopaedic prose, you can always ask your preferred LLM to jazz it up for you 😏 (re @u99of9: One question about a risk that I think is valid: "are all the sentences/articles going to come out dry and boring?")