In Depth: Going All In with Local LLMs
A primer on how to protect your IP, reduce costs, and avoid prying eyes on your work with AI
Networked technology and cloud computing provide the foundation for many modern conveniences—a virtually limitless catalog of songs to listen to, takeout and grocery deliveries at the tap of a button, scalable computing power, and now AI-powered assistants.
Many brands offer a "freemium" option to entice us to trial what they have on offer: a strategy where basic services are provided at no cost, with the option to upgrade to a fuller set of features for a fee. This sounds straightforward, but there are fundamental elements to these transactions that are hidden from view, and it can take a bit of digging to fully understand the value exchange between customer and provider.
In well-established freemium models, like advertiser-supported broadcast television and radio, services are monetized by connecting advertisers with consumers in the context of programming that is provided at no cost to the viewer. In the digital space, services are also monetized by collecting browsing and intent data (or the derivative knowledge it provides) and using that to match consumers with relevant advertising messages.
There are other elements to the economic formula, too. For example, Google can deliver highly targeted ads by collecting user browsing behavior and search history, while also improving its ecosystem of products. None of this is novel at this point, and users have voted with their fingers: they continue to use products and services offered under this model.
But generative AI services can take things further, because they capture something far more sensitive: our thought processes, detailed research and work notes, and our ideas. And in many cases, the data you put into these services is used to train future models, which is essential to their rapid evolution and success.
This is not speculation on my part; it's an open secret. An AI model provider shared this with me: "It's how we learn and advance quickly. There's no other way for us to do it."
Established AI providers have an incentive to keep customer data protected and to themselves, and you may decide that handing all that data over to a provider is worth the trade-off. But there are well-founded concerns that not all players in the market share the same values, or the sense of responsibility for security, that we might reasonably expect.
DeepSeek, for example, has recently come under criticism on multiple fronts:
DeepSeek has a “TikTok problem.” Reporting indicates that DeepSeek can transfer data — your data — directly to the Chinese government.
DeepSeek’s responses are censored. See for yourself: Ask it about Taiwan. Bonus — open up its “thinking” window to see how it comes up with the answer. It’s instructive to see how the process is shaped by uncertainty and doubt introduced by its training data.
Security lapses. DeepSeek’s application was caught leaking user prompts and sensitive information to the web through unsecured file stores.
Just one of these issues would be reason to avoid DeepSeek, but the trio screams “avoid at all costs” to me.
Freemium or not, there's another potential challenge you may encounter with third-party hosted AI models: providers put guardrails on the topics and activities permitted on their platforms. This means that some topics are off limits.
That’s good, right?
For the average user, I believe the answer is unequivocally yes.
But ... if you’re a journalist working on a case that involves difficult subjects, or even covering hot-button topics during an election, some services may refuse to cooperate.
I saw this firsthand during the 2024 U.S. presidential election. And while I fully agree that these guardrails are necessary to stop, say, someone who is looking for information on how to build a WMD, they also prevent legitimate, well-intended use of the tools in some cases.
There is a potential fix: offer a version of these tools to known and verified individuals, such as reporters and researchers, who are cleared to use third-party AI tools for these topics. However, I'm not aware of a mainstream service that offers this option, which means certain workforce segments can’t fully integrate hosted LLM services into their workflows.
Thankfully, there is another option that sidesteps most of these issues: locally-hosted large language models (LLMs).
A Primer on Locally Hosted LLMs
A locally-hosted LLM runs directly on your personal computer or a private server instead of relying on cloud-based, third-party services.
By processing data and generating responses on systems that are controlled by you (or your company), you gain privacy, security, and full control over your AI interactions.
These professional groups stand to benefit from the use of locally-hosted LLMs:
Academics & Researchers – Prevent premature exposure or corporate surveillance when working on sensitive or proprietary topics.
Investigative Journalists – Avoid data collection by AI providers while covering controversial subjects or government affairs.
Legal Professionals & Corporations – Maintain confidentiality for client data and proprietary business strategies.
Creative Professionals & Writers – Protect intellectual property from being used to train external AI systems.
Government & Policy Analysts – Ensure research and policy analysis remain internal to mitigate external influence or surveillance.
While privacy and security are clear reasons to use a locally-hosted LLM, there are others as well:
More Use Case Options – Hosted AI services may refuse to assist with certain topics. For instance, before the 2024 U.S. election, some AI platforms declined to assist journalists with election-related topics. Reporters working on sensitive cases—such as investigations into violent crimes—may be blocked from using AI tools to analyze critical information and develop stories.
Reliability – The model(s) you use in your local LLM setup won't change unless you change them yourself, which provides reliability and stability. AI providers rapidly evolve their services and models, which means the prompts that worked well for you yesterday may not work the same today. I recently experienced this when a provider I use frequently "upgraded" to a new version and some of my customized GPTs stopped working.
Customization – You can augment the performance of your AI model with RAG (Retrieval-Augmented Generation), or, for those really into AI development, you can fine-tune a model to your unique needs.
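To make the RAG idea concrete, here is a minimal sketch of the pattern: retrieve the documents most relevant to a question, then fold them into the prompt sent to the model. This toy version scores relevance by simple keyword overlap (real systems use vector embeddings), and the final call to a local model via LM Studio or Ollama is left as a comment. The document contents and function names are illustrative, not from any particular library.

```python
# Minimal RAG sketch: keyword-overlap retrieval plus prompt assembly.
# Real systems replace score() with embedding similarity.

def score(query: str, doc: str) -> int:
    """Count how many of the query's words appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend retrieved context to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Use this context to answer.\nContext: {context}\nQuestion: {query}"

# Hypothetical private notes that never leave your machine.
notes = [
    "Meeting notes: the Q3 launch is scheduled for October 12.",
    "Recipe archive: the chili needs two hours of simmering.",
]

prompt = build_prompt("When is the Q3 launch?", notes)
# `prompt` now carries the relevant meeting note; send it to your
# locally hosted model instead of a cloud API.
```

The point of the pattern is that your private documents stay on your machine: only the assembled prompt ever reaches the model, and with a local model, not even that leaves your computer.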
Locally-hosted LLMs provide the ability to use AI without interference, data mining, or reliance on companies that can change policies overnight. They also provide a degree of protection for those working in areas where political actors will take an interest in their work. While service providers may be required to turn over user data pursuant to subpoenas or legal requests, locally-hosted LLMs are less susceptible to these risks.
The Technical Side: It’s Easier Than You Think
The idea of “hosting your own AI” may sound complex, conjuring images of servers and lines of code. I thought so too at first, but in reality, setting up a local AI model is surprisingly easy. Within about 15 minutes and with the right resources, you can have a fully functional LLM running on your computer, free from cloud dependencies and surveillance risks.
What You'll Need:
Hardware – Many modern laptops (e.g., Apple Mac computers with Apple Silicon or PCs with dedicated GPUs) can run smaller AI models. Corporate users should check if their IT teams have private LLM instances available in on-premise or private cloud environments.
Software – Tools like LM Studio provide an intuitive interface, eliminating the need for coding expertise. If you are familiar with computer systems and software development, you might also check out Ollama.
Model Selection – Choosing the right AI model depends on its intended purpose. More on that below.
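Once a model is loaded, LM Studio can expose an OpenAI-compatible server on your own machine, which is how other programs talk to your local model. The sketch below builds such a request using only Python's standard library; the default port (1234) and the placeholder model name are assumptions that depend on your local setup, and the actual network call is commented out because it requires the local server to be running.

```python
# Build an OpenAI-style chat request for a locally hosted model.
# Assumes LM Studio's local server at its default address; adjust
# the URL and model name to match your own configuration.
import json
import urllib.request

def build_request(prompt: str,
                  model: str = "local-model",
                  url: str = "http://localhost:1234/v1/chat/completions"):
    """Return a ready-to-send HTTP request for a local chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Summarize my notes on local LLMs.")

# To actually send it (requires the local server to be running):
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())
#     print(reply["choices"][0]["message"]["content"])
```

Because the request goes to localhost, nothing in the prompt ever traverses the public internet, which is the whole point of the setup described above.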
Quick Guide to Choosing an Open-Source LLM
Selecting the right AI model for your needs is an important step, and there are plenty of options available.
Each model is designed for different tasks, so understanding their strengths will help you make an informed decision. Whether you're looking for a generalized chat experience, code generation, or advanced reasoning, choosing the right model can significantly impact your workflow and efficiency.
Below is a guide to help you identify which open-source LLM might be best for your specific use case.
General-Purpose Chat & Assistance
Mistral 7B – A general-purpose model that performs a wide range of tasks well. Small, efficient, and high-performing.
Llama 3 (Meta) – Versatile, with good reasoning and efficiency.
Gemma (Google DeepMind) – Optimized for lightweight deployment and chat capabilities.
Best for: AI-powered chatbots, customer support, and general knowledge tasks.
Code Generation & Assistance
Code Llama (Meta) – Tailored for multiple programming languages.
StarCoder (BigCode) – Ideal for Python, JavaScript, and more.
WizardCoder – Fine-tuned for improved coding problem-solving.
Best for: Writing, debugging, and understanding code.
Reasoning & Problem-Solving
Mixtral (Mistral AI) – A sparse Mixture-of-Experts model that excels at multi-task workloads.
DeepSeek-R1 – If you want to see DeepSeek in action without surveillance and data-leakage risks, you can download the DeepSeek-R1 model to run locally too.
Best for: Scientific research, knowledge work, and complex decision-making.
Taking the First Step
The rapid evolution of AI brings incredible opportunities, but it also raises new concerns about privacy, security, and control over our own intellectual property.
Popular hosted AI services offer convenience but come with trade-offs—data collection, restrictions, and the risk of political or corporate interference. For those who rely on AI as a critical tool in their work, the ability to run a locally hosted LLM provides security, reliability, and independence.
If you value privacy, consistency, and full control over your AI-powered workflows, now is the time to explore local AI hosting.
With the right tools (which you may already have!) and a small investment of time, you can take advantage of the latest AI advancements without compromising your data or creative process.