Building a Knowledge Base

In the course of building this trial site, I experimented with searching the postings in the Google Group “archives” and copying previous threads into Discourse to add to a current discussion. This took about 20-30 minutes per thread when the thread was long. I then used the current basic AI tool to summarize the thread and put the summary in the wiki category, but did not take the additional step of tagging it with keywords to make searching more effective.

I thought that if we did migrate to Discourse, one option for dealing with the current 6,000+ threads of Google Group content would be to continue this practice over time and harvest perhaps several hundred still-relevant threads, but do it in a more organized way. Rewriting and verifying the result for accuracy would add a valuable but time-intensive layer.

Two new postings in one of the Discourse Evaluation discussions revisited this idea of building a curated knowledge base. I understand that this has been attempted before, but perhaps this is a time to raise this idea again.

I have moved those postings here to discuss this as a separate topic.

I would abstain from any vote as I personally don’t care enough to advocate for a switch from GG, whether to Discourse or any other tool.

To put my comment in context, I’ve been an INA member since 2004. Most of my information needs have been well served by the original Nonsuch owner’s manual and Westerbeke manuals. On rare occasion I have posted a question in GG and sometimes received useful responses. I made such posts after first searching GG to see if my question had already been posed and discussed previously. This part of the GG experience (searching and sifting, or rather trying to find the answer I needed if it already existed) is the main thing I hope a migration would improve.

I also relied on Mike Quill for assistance with rigging questions or replacement of key components.

And lately I find the effort to produce the new owners guide to be laudable. More than anything it suggests to me that an effort by some group of us to mine the existing knowledge base that GG represents and organize it into a wiki would be a great addition to the documents that currently exist. In that light, the wiki feature mentioned in the discourse on Discourse would seem to be a great tool to have going forward. Perhaps it could be leveraged to efficiently mine existing GG discussions that were gradually brought over to Discourse when we identify that a ‘new’ question is an apparent iteration on an old question.

Mike, I had similar thoughts back in December when an owner asked about options to run power to a boat without shore power. The conversation revolved around using a pigtail to connect a 30A pedestal plug to a heavy duty extension cord. The discussion contained some reasonable ideas, but also contained some opinions and assumptions about risk tolerance. I felt that the topic deserved to be summarized, so I wrote a summary ( HERE ).

I learned a few things.

  1. Our platform discussion is blurring the distinction between reference materials and conversations. Conversations provide “hints” that an owner can consider; reference materials are held to a higher standard and need to provide reliable, proven solutions. We need both conversations and reference materials. Thinking they can be equated is unrealistic.
  2. It takes some effort and knowledge to extract and summarize credible information from a conversation.
  3. While AI can summarize what is said, it hasn’t yet evolved enough to curate information (at least to my satisfaction).
  4. INA doesn’t currently have a repeatable process to review documents for accuracy before adding them to our information repository. There is no standard way for me to submit my temporary shore power summary.

I agree that folks willing and able could build lasting value for the Nonsuch community by volunteering to identify topics of interest and writing summaries. I think we should identify, support, and encourage volunteers to start working together as a team (easier said than done). The tools they end up using will depend on their vision for the repository and their skillset.

Buying the tool is the easy part. I consult with not-for-profits and I see lots of good money wasted on tools that don’t get used because management bought them without a plan for who would use them. Volunteers like to have input into the tools they are expected to use.

My advice is:

  • Assemble the team that is committed to the goal. An information repository and a discussion group are two different goals, unless the goal is to merge them, which is much more difficult.
  • Ask the team to make a plan.
  • If the plan is approved, provide whatever resources the team needs to achieve the goal.

With spring arriving I know I am not willing to volunteer unless the effort is focused and well defined. Without the plan or the volunteers, we probably don’t need to buy the tools. People first.

My gut is telling me the challenge needs to be framed differently to attract the volunteers necessary to pursue a common goal. Since we already have volunteers and two platforms (Google & Facebook) that promote reasonably effective conversations, it makes sense to work on a plan for enhancing our information repository.

To be honest, I like sailing more than opining about collaboration platforms.

Come April… I’ll be sailing.

Go to Deepseek or ChatGPT

“Import google groups to discourse”

Follow directions

Just a suggestion, but it seems to boil down to an “email” import.

Or find a youngster on Upwork who will do it all for a couple hundred bucks

Ignore if annoying

Cheers

Roger Peebles

I appreciate the suggestion and I have looked into this. (I also just asked ChatGPT and got the predictable step-by-step.) There is a “How to” topic on the Discourse support site on how to do this migration, which includes over 100 replies about what didn’t work and possible workarounds.

I touched base with a consultant about doing this work and got a price of $1,500 if we do some of the work (get a clean “mbox” export file, do a site export and import, etc.).

Part of the problem, in the context of the above discussion, is that we would get over 6,200 conversations of opinions and data (some of it 12 years old) and would still have to go through a review process if we want to build an actual knowledge base. Getting the old discussions would be the ‘easy’ part (not that it would be all that easy, but certainly doable).
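For anyone curious what a first pass at scoping that review might look like once we have an “mbox” export, here is a rough Python sketch using only the standard library. It assumes the export is a single mbox file and that grouping messages by subject line (with “Re:” and “Fwd:” prefixes stripped) is a good enough stand-in for a conversation; proper threading would use the reply headers instead, and encoded non-English subjects are not decoded here. The function names and the file name in the usage comment are my own placeholders, not part of any Discourse or Google tooling.

```python
import mailbox
import re
from collections import defaultdict
from email.utils import parsedate_to_datetime

# Strip leading "Re:", "Fw:", "Fwd:" prefixes (any case, possibly repeated).
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Reduce a subject line to a rough conversation key."""
    return _PREFIX.sub("", subject or "").strip().lower()


def summarize_threads(mbox_path):
    """Return (subject, message_count, latest_year) tuples, busiest first.

    Assumption: messages sharing a normalized subject belong to one
    conversation. latest_year is None when no Date header could be parsed.
    """
    counts = defaultdict(int)
    latest_year = {}
    for msg in mailbox.mbox(mbox_path):
        key = normalize_subject(str(msg.get("Subject", "")))
        counts[key] += 1
        try:
            year = parsedate_to_datetime(str(msg.get("Date", ""))).year
        except (TypeError, ValueError):
            continue  # missing or malformed date: still counted above
        latest_year[key] = max(year, latest_year.get(key, year))
    rows = [(subj, n, latest_year.get(subj)) for subj, n in counts.items()]
    return sorted(rows, key=lambda r: (-r[1], r[0]))


# Example usage (hypothetical file name):
# for subject, count, year in summarize_threads("ina-archive.mbox")[:25]:
#     print(f"{count:4d}  {year}  {subject}")
```

A list like this would only tell us which conversations are long or recent. Deciding which ones are worth summarizing, and checking the summaries for accuracy, would still be the volunteer work discussed above.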