atlas by clearpeople

Nourish your AI with good information - or risk bad information

5 April 2023
  

Rubbish in. Rubbish out.

Anyone involved in building or delivering search solutions, or enterprise content management more widely, will subscribe to this mantra. Put more diplomatically: if you don't have good content, how can you expect good outcomes, such as high-quality search results?

With Artificial Intelligence (AI) engines and Large Language Models (LLMs) such as ChatGPT, and with the Microsoft 365 Copilot system (Copilot), we no longer have to worry about that. AI will just clean up and remove bad content and then create new, much better content for us. It will also organise and tag it perfectly. Right?

Sadly not.

The fact is, however, that with Microsoft 365 Copilot, Microsoft have created a tectonic shift in everyone's perception of what is possible "here and now". Whether the technology becomes generally available to M365 users in 2023 or 2024 does not matter. The cat is out of the bag and you can forget about putting it back.

So I want to unpack what this means for organisations, and provide a bit of pragmatic level-setting, so you can decide and plan what is important and high-priority for your organisation, should you wish to take advantage of Copilot in the (near) future.

In my separate blog about Microsoft 365 Copilot and Business Chat you can read about the process that takes place when a user and/or an application interacts with ChatGPT via Copilot. 

In the context of this blog, the bottom-left-hand-side of this diagram (from the original Microsoft announcement) is of huge importance.

Copilot Queries the Microsoft Graph

This is the point where the AI and Copilot meet your business data in the Microsoft Graph (the Graph), so an appreciation of what data the Graph holds and how it works is fairly important (if you feel you need a quick explanation of what the Graph is, check out what ChatGPT says at the bottom of this blog).

At a high level, what happens is that Copilot sends queries (think of them as similar to search requests) to the Microsoft Graph and then uses the results of these queries in combination with the original prompt (i.e. what the user or application asked for).

Copilot is in effect using your prompt, the context of your prompt and your profile to form a query to the Graph for “organisation data” that it can use to enrich your prompt and the response to your prompt.
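To make the idea concrete, here is a minimal sketch in Python of how a prompt could be enriched with organisation data before it reaches the language model. It is purely illustrative and is not how Copilot is actually implemented: `Document`, `search_org_data` and `enrich_prompt` are hypothetical names, and the naive keyword match merely stands in for a real Graph query.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def search_org_data(query: str, corpus: list[Document]) -> list[Document]:
    """Hypothetical stand-in for a Graph query: a naive keyword match."""
    terms = query.lower().split()
    return [doc for doc in corpus if any(term in doc.text.lower() for term in terms)]


def enrich_prompt(prompt: str, context: str, profile: str, corpus: list[Document]) -> str:
    """Combine the original prompt with the organisation data found for it."""
    # Assumption: the query is formed from the context of the prompt only.
    results = search_org_data(context, corpus)
    sources = "\n".join(f"- {doc.title}" for doc in results)
    return (
        f"User: {profile}\n"
        f"Request: {prompt}\n"
        f"Relevant organisation data:\n{sources}"
    )


if __name__ == "__main__":
    corpus = [
        Document("Blue Bay bid plan", "Bid plan for the Blue Bay Hotel and Business Park"),
        Document("Canteen menu", "Lunch options for this week"),
    ]
    print(enrich_prompt("Recap meeting", "Blue Bay", "Anthony Giles", corpus))
```

The point of the sketch is that the quality of the enriched prompt depends entirely on what the search step returns.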

Let me explore this a bit further with a fictitious use case in a fictitious global property construction and engineering company called Builqo.

The bid team has just met to discuss their bid approach to win a new large programme of works (“Blue Bay Hotel and Business Park”) for an existing strategic client, “Alpha Real Estate Investments”. The programme of works they are bidding for is for a new luxury hotel and business centre in the Middle East. A recent similar project, “Green Plains”, had been run at a large loss, so there are some management concerns about going for this bid. Jaidyn from Finance, who reports to the CFO, also joined the meeting to provide financial insight into previous projects and the profitability of this particular strategic client.

Builqo org diagram showing Jaidyn

The participants in the meeting were all quite senior, but no one ranked higher than Jaidyn, as she reports directly to the CFO. Jaidyn didn’t say much, but she did say “audit” and “review” a few times, and made reference to clear guidance from Kristin (CFO) and Juliet (CEO) about gross profit requirements for projects.

Anthony Giles, the Head of Bid Management, needs to brief the executives for a Bid / No-Bid decision. After the meeting, he wants a recap of the Teams meeting and has Copilot summarise it for him, with relevant tasks, actions, names of colleagues and links to relevant documents. Anthony executes the prompt by clicking on the helpful “Recap meeting” button within Microsoft Teams.

Copilot now asks the Graph in the background for relevant organisational data, before getting it all summarised back to Anthony.

The actual query that is sent from Microsoft Teams via Copilot will be more complex than a single button click suggests. I obviously have no idea exactly what it will look like, but a simplified pseudo prompt might look something like this:

“Recap Microsoft Teams meeting [Meeting ID] for [User Account]”

We can be fairly certain that the prompt will at least include a reference to the application, the context (the meeting) and the person this all relates to.

Copilot will now take this prompt and enhance it further, by checking the Graph. I imagine it might carry out queries such as these:

    • “Find internal/external participants in the meeting [Meeting ID]”
    • “Lookup organisational relations around [User Account] to understand the relation to and reporting lines of the participants”
    • “Find content that relates to [keywords from meeting transcript]”
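The pseudo queries above can be sketched as simple templates that are filled in from the prompt. This is only my illustration of the idea, written in Python: the templates are copied from the list above, and `QUERY_TEMPLATES` and `expand_recap_prompt` are hypothetical names, not anything Microsoft has published.

```python
from string import Template

# Hypothetical query templates, copied from the pseudo queries above.
QUERY_TEMPLATES = [
    Template("Find internal/external participants in the meeting $meeting_id"),
    Template(
        "Lookup organisational relations around $user_account to understand "
        "the relation to and reporting lines of the participants"
    ),
    Template("Find content that relates to $keywords"),
]


def expand_recap_prompt(meeting_id: str, user_account: str, keywords: list[str]) -> list[str]:
    """Turn one 'Recap meeting' prompt into several follow-up queries."""
    values = {
        "meeting_id": meeting_id,
        "user_account": user_account,
        "keywords": ", ".join(keywords),
    }
    # Each template only picks up the values it needs.
    return [template.substitute(values) for template in QUERY_TEMPLATES]


if __name__ == "__main__":
    for query in expand_recap_prompt("M-123", "anthony.giles", ["Blue Bay", "Green Plains"]):
        print(query)
```

The meeting ID and account name in the usage example are made up. Note that the third query is only as good as the keywords that are fed into it.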

Anthony might then ask Copilot in Word for an initial draft of the executive briefing, based on the outcomes and decisions in the meeting.

While it's fairly easy to write out the “pseudo queries” as I did above, the complexity of what Copilot has to do, and of how it gets authoritative sets of results, should not be underestimated.

Take this query here: “Find content that relates to [keywords from meeting transcript]”

Which keywords might those be?

For instance, will “Audit”, “Review” or “Gross Profit” weigh too heavily? They are all “important words or phrases”, have strong and well-understood meaning and they were said by one of the most senior participants.

What is going to guide Copilot to locate the most relevant documents returned from the Graph query?

I would expect to get information that is relevant to “Alpha Real Estate Investments” (such as previous bid information) as well as extracts from, or links to, materials already created for the “Blue Bay Hotel and Business Park” bid. And it would be useful to see lessons learned from the costly “Green Plains” project, to inform mitigation strategies against the risk of overrun.

My belief is that Copilot will perform well if the organisation has good information management principles in place, and it will perform even better if a consistent Information Architecture and information classification (taxonomy) is in place and well adopted across Microsoft 365 content.

On the other hand, if content is all over the place, with little structure, with redundant versions and revisions, and with even less classification, how can we expect any AI, even Microsoft's own Copilot, to provide us with outcomes (such as a draft executive briefing document or a proposal) that we can regard as a “good enough” starting point from which to continue?

If, for example, all content that relates to the “Green Plains” project was automatically tagged against the “Green Plains” project code, then it would be reasonable to expect Copilot to prioritise this content over an email or other document that just mentions “green plains” in its text.
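A minimal sketch of that expectation, in Python, might look like the following. I am assuming a very simple scoring scheme here: the weights, the `GP-001` project code and the names `Item`, `score` and `rank` are all hypothetical, and I have no knowledge of how Copilot or the Graph actually rank results.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    text: str
    project_codes: set[str] = field(default_factory=set)


# Assumed weights: an explicit tag counts for far more than a text mention.
TAG_WEIGHT = 10.0
MENTION_WEIGHT = 1.0


def score(item: Item, project_code: str, project_name: str) -> float:
    """Score an item by its project tag and by mentions of the project name."""
    total = 0.0
    if project_code in item.project_codes:
        total += TAG_WEIGHT
    if project_name.lower() in item.text.lower():
        total += MENTION_WEIGHT
    return total


def rank(items: list[Item], project_code: str, project_name: str) -> list[Item]:
    """Return the items with the highest score first."""
    return sorted(items, key=lambda item: score(item, project_code, project_name), reverse=True)


if __name__ == "__main__":
    items = [
        Item("Email: lunch on Friday", "See you near the green plains cafe."),
        Item("Lessons learned report", "Cost overrun analysis.", {"GP-001"}),
    ]
    for item in rank(items, "GP-001", "Green Plains"):
        print(item.title)
```

Without the tag, the lessons learned report in this example would score zero and the lunch email would win, which is exactly the kind of outcome good classification helps to avoid.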

Our task – if we want to get the best out of Copilot, Business Chat, ChatGPT and other AI and LLM systems – is to at least have our house in reasonable order. If our house is in a mess and full of rubbish – then we cannot expect any more than mediocre outcomes. Rubbish in. Rubbish out.

Or, as per the title of the blog, make sure you nourish and feed your AI with good, trustworthy and well-structured data, and it might actually give you some great answers.

A subsequent blog you may now want to read deals specifically with "Get your People and Content Ready for Microsoft Copilot". 

 


 


What is the Microsoft Graph?

Prompt to OpenAI ChatGPT 3.5:
“Explain to a non-technical business leader how the Microsoft Graph holds data and is in some ways similar to a search index”

Microsoft Graph diagram

Response from OpenAI ChatGPT 3.5:
"Microsoft Graph is a database that connects all your Microsoft tools and services, holding data in one central location. It's similar to a search index, which organizes and stores data to make it easily accessible. When you use Microsoft tools like Outlook, Teams, or Excel, the Graph captures information like emails, files, and conversations, and stores them in a structured way. This makes it easy to search and access the information you need across all your tools. It's like having a search engine for your business data. You can also use the Graph to automate tasks like scheduling meetings or sending emails by using data from all your Microsoft tools. In short, the Graph makes it easy to manage and access data across all your Microsoft tools, similar to how a search index organizes and retrieves data from different sources."


 

Atlas: The best Digital Workplace platform for Microsoft 365

Atlas is the digital workplace platform of choice that powers Microsoft 365. 

  • Automated governance and information architecture to minimize risks of “hallucinations”, invalid or irrelevant content.
  • Aggregates scattered information and makes this more accessible to Microsoft Copilot.
  • Seamless integration to a broader ecosystem of apps and data sources.
  • Atlas elevates the user experience of Microsoft technology by providing a visually appealing and intuitive interface.  

Find out what Atlas can do. Book a demo now.


Author bio

Gabriel Karawani


Gabriel is Co-Founder of ClearPeople, responsible for the overall technical and Atlas Intelligent Knowledge Platform vision. He works closely with colleagues at Microsoft on roadmap alignment and innovative Content AI Services programs such as Microsoft Viva Topics and SharePoint Premium (previously known as Syntex). Gabriel was part of Microsoft's partner program for Project Cortex.
