Slacor Heya. Thanks for replying here.🙂 Again, impressive forum you've been running here, for so long, with a lot of neat features.
Slacor I hope you're taking into consideration PII
So, to clarify, we've used the term "scrape" above, but I've not processed the forum info in any way or stored it myself. (I've not even viewed the majority of it myself.) Google's servers have taken it directly, as is publicly visible on the internet. I'd liken the process somewhat to prompting the Wayback Machine to make an archive of a web page and then pointing people to that third-party-hosted version of the information.
That said, the NotebookLM notebook does, of course, contain people's names, locations, email addresses, etc., where they've posted them publicly in the threads I've listed (screenshot1). Regular Gemini (for example) can access this information similarly (screenshot2, 3), if prompted to do so, without anything being uploaded to it.
So, what are your thoughts?
I don't recall if there was anything like a service agreement when I made an account here? Couldn't find one just now. Whereas, Discord server content, for example, is not publicly visible and scraping data is technically against Discord TOS. Theoretically punishable by account banning. But in practice not really. I'm told users technically retain intellectual ownership of what they post to servers there. Although I'm not sure how enforceable that is, if so. (It's quite a shame so much helpful info is siloed away there, but that's another discussion.)
Slacor I would consider implementing a developer API
That sounds cool. Although I don't really have the personal capacity to do much more than make brief use of the accessibility service you've already provided. I had considered the appeal of helping out with the wiki… But, from past experience, that's a thankless task that saps all available time and energy.
I think it would be very nifty if someone made an official chatbot, of some kind, that fully indexed the whole of this forum and automatically updated with new content (as talked about above). I think NotebookLM, with a Gemini Pro subscription, could just about fit in all the posted text information. But that would require scraping and processing to combine multiple threads per source, to fit inside the 300 source limit (50 without subscription)…
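For what it's worth, here's a rough sketch of what that combining step might look like, assuming the thread text had already been gathered somehow. The function name `pack_threads`, the `(title, text)` input shape and the `=== title ===` separator are all made up for illustration. It just spreads threads across a fixed number of merged documents, largest first, so no one document ends up much bigger than the rest. It doesn't fetch anything, and it doesn't check any per-source size cap NotebookLM may have.

```python
from typing import List, Tuple


def pack_threads(threads: List[Tuple[str, str]], max_sources: int = 300) -> List[str]:
    """Merge (title, text) threads into at most max_sources combined documents.

    Hypothetical helper: greedy balancing, not an official NotebookLM workflow.
    """
    if max_sources < 1:
        raise ValueError("max_sources must be at least 1")

    # Never create more bins than there are threads.
    bin_count = min(max_sources, len(threads))
    bins: List[List[str]] = [[] for _ in range(bin_count)]
    sizes = [0] * bin_count

    # Place the longest threads first, each into the currently smallest bin.
    for title, text in sorted(threads, key=lambda t: len(t[1]), reverse=True):
        smallest = sizes.index(min(sizes))
        bins[smallest].append(f"=== {title} ===\n{text}")
        sizes[smallest] += len(text)

    return ["\n\n".join(parts) for parts in bins]


if __name__ == "__main__":
    # Made-up sample data: 10 threads squeezed into 3 sources.
    sample = [(f"thread-{i}", "x" * (i + 1)) for i in range(10)]
    for n, source in enumerate(pack_threads(sample, max_sources=3), start=1):
        print(f"source {n}: {len(source)} characters")
```

Each merged document keeps a title line per thread, so answers could still be traced back to the original thread.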
That person is not me, right now. Sorry. And I'm not looking to make any money redistributing info. Nor pay in order to help out. How are your finances? Given how tough it is to solicit any opinion or input of any kind, in the community, I imagine donations are sadly limited too.
Slacor the cost of server load
Can I ask, are you paying for hosting? (Or self-hosting?) I hope the use of the accessibility feature didn't put too much strain on it. Maybe this can interrupt other people's page loads for some minutes? Nothing much worse?
But it's a one-time thing, to pull pages into a NotebookLM. No recurring impact from others using a shared notebook. Although there could potentially be if people used the page list I've made to set up their own instances, privately. (Unlikely at this stage, from what I've observed.)