
Guide & Help Centre

Getting started, features, and comprehensive help documentation

Getting Started

ThinkHere runs entirely in your browser. There is nothing to install and no account is required to start chatting.

No account

Step 1
Open thinkhere.ai in a supported browser
Step 2
Click Load Model to download and initialize SmolLM2 1.7B
Step 3
Start chatting — everything runs locally on your device

With a free account

Step 1
Create a free account at app.thinkhere.ai
Step 2
Choose from 8 models and click Load
Step 3
Conversations are saved automatically and can be exported

The first load downloads model weights (600 MB–4.5 GB depending on the model). After that, the model loads from your browser's cache in seconds.
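Under the hood, this caching uses the browser's Cache Storage API. As a rough sketch — the cache name and weight URL below are hypothetical stand-ins, not ThinkHere's actual keys — a check for an already-downloaded model could look like:

```typescript
// Hypothetical cache name and URL, for illustration only --
// ThinkHere's real storage keys may differ.
async function isModelCached(cacheName: string, weightUrl: string): Promise<boolean> {
  const cacheStorage: any = (globalThis as any).caches;
  if (!cacheStorage) return false; // Cache Storage unavailable (e.g. outside a browser)
  const cache = await cacheStorage.open(cacheName);
  const hit = await cache.match(weightUrl); // undefined if never downloaded
  return hit !== undefined;
}
```

If this returns true, the app can skip the large first-time download and load the weights locally.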

Feature Tiers

ThinkHere offers three tiers. All tiers run the model locally in your browser — your conversations never leave your device.

Feature                                                   Free   Logged In   Pro
Text model (SmolLM2 1.7B)                                 ✓      ✓           ✓
File & PDF upload as context                              ✓      ✓           ✓
Conversation history & saving                             —      ✓           ✓
All 8 models                                              —      ✓           ✓
Conversation export                                       —      ✓           ✓
Long conversation memory (context compression)            —      ✓           ✓
Custom instructions (system prompts)                      —      —           ✓
Creativity settings (temperature & generation controls)   —      —           ✓
File-powered answers (knowledge base / RAG)               —      —           ✓
Voice transcription                                       —      —           ✓

Chat Models

The free tier on thinkhere.ai includes SmolLM2 1.7B. All 8 models are available with a free account at app.thinkhere.ai.

Model          Family       Runtime           Size      Multi-modal   Tier
SmolLM2 1.7B   SmolLM       WebLLM            ~1 GB     —             Free
Qwen3.5 0.8B   Qwen         Transformers.js   ~600 MB   ✓             Logged In
Qwen3.5 2B     Qwen         Transformers.js   ~1.5 GB   ✓             Logged In
Qwen3.5 4B     Qwen         Transformers.js   ~2.9 GB   ✓             Logged In
Mistral 7B     Mistral AI   WebLLM            ~3.8 GB   —             Logged In
Llama 3.2 1B   Llama        WebLLM            ~700 MB   —             Logged In
Gemma 3n E2B   Gemma        MediaPipe         ~3 GB     ✓             Logged In
Gemma 3n E4B   Gemma        MediaPipe         ~4.3 GB   ✓             Logged In

Model sizes are approximate download sizes. Larger models generally produce higher quality output but require more RAM and generate tokens more slowly. Models marked as multi-modal support image and PDF input alongside text.

Supporting models

ThinkHere also uses the following models behind the scenes for Pro features. All run entirely on-device via WebGPU — no data is sent to any server.

Model                    Purpose                           Runtime                          Size   Tier
Qwen3 Embedding 0.6B     Knowledge base embeddings (RAG)   Transformers.js · ONNX Runtime   —      Pro
Whisper Large V3 Turbo   Voice transcription               Transformers.js · ONNX Runtime   —      Pro

Device Requirements

ThinkHere runs AI inference on your local hardware, so device capability matters.

Browser
Chrome 113+, Edge 113+, Safari 18+
API
WebGPU required
Memory
Minimum 4 GB RAM for SmolLM2 1.7B
Best Experience
Desktop or laptop with dedicated or integrated GPU
iPhone
Not supported — iOS memory limits prevent model loading
iPad
M-series iPads may work, though experience varies

Getting Started

1. What is ThinkHere?

ThinkHere is a private AI chat app that runs entirely in your browser on your own hardware. Instead of sending your prompts to a remote AI server, ThinkHere downloads a language model to your device and runs it locally — using your GPU via WebGPU.

The result is an AI assistant where your prompts and responses never leave your browser tab. There is no cloud inference, no server storing your conversations, and no third party processing what you write.

ThinkHere is built by Qanata Lab and core features are open source under the MIT licence.

↑ Back to top

2. How to start using ThinkHere

To begin, open ThinkHere in a supported browser on a compatible device. No installation is required — ThinkHere runs entirely in the browser tab.

  • Open thinkhere.ai in Chrome 113+, Edge 113+, or Safari 18+.
  • ThinkHere will check that your browser supports WebGPU. If it does, you will be taken straight to the chat interface.
  • Without an account, SmolLM2 1.7B loads automatically. The first load includes a model download of ~1 GB, so allow a few minutes depending on your connection.
  • Once the model is ready, type your first message and start chatting. All processing happens on your device.
  • If you want access to more models and features, create a free account.

↑ Back to top

3. Use ThinkHere without an account

No account is required to use ThinkHere. Visit the site in a supported browser and you can start chatting immediately with SmolLM2 1.7B — a fast, lightweight chat model optimized for on-device use.

The no-account experience gives you a clean, fast path into local AI with no sign-up friction. It is ideal for trying ThinkHere for the first time or for one-off conversations you do not need to save.

Without an account you do not get conversation history, export, the full model library, system prompts, or knowledge base features. See what a free account unlocks.

↑ Back to top

4. Create a free account in ThinkHere

Creating a free account unlocks the full ThinkHere experience at no cost. To sign up, click Create Account on the ThinkHere homepage and provide your name and email address.

Signing in does not change the core architecture. ThinkHere still runs the AI model locally in your browser on your own hardware — your conversations are never sent to Qanata Lab's servers regardless of whether you are signed in.

Your account stores your preferences, settings, and feature access only. See everything a free account includes.

↑ Back to top

5. Your first model in ThinkHere

The first time you load a model in ThinkHere, the browser downloads the model weights (typically 600 MB–4.5 GB depending on the model), stores them in your browser's Cache Storage, and compiles them for your specific GPU using WebGPU shader compilation. This only happens once per model.

On subsequent visits, the model loads from the local cache in seconds. You do not need to download it again unless you clear your browser's cached data.

If the first load seems to stall, check your network connection and make sure you have at least 6 GB of free RAM. Closing other browser tabs frees up memory and can help.

↑ Back to top

How ThinkHere Works

6. How ThinkHere runs AI in the browser

ThinkHere uses two key pieces of browser technology to run AI locally: WebGPU for GPU-accelerated compute, and the WebLLM framework to manage the model pipeline.

When you send a message, ThinkHere processes it entirely within the browser tab. The model receives your prompt, generates a response token by token, and streams it back to the interface — all without any network request leaving your device. The full pipeline looks like this:

  • Model weights are downloaded once and cached in the browser's Cache Storage
  • On load, WebLLM compiles the model for your GPU using WebGPU shader compilation
  • Your prompt is tokenised and passed to the model running on your GPU
  • The model generates a response locally and streams it to the chat interface
  • Nothing in this process involves a server — it is entirely on-device
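The token-by-token streaming step can be sketched with plain generator code. Everything below is an illustrative stub — the canned reply, the toy "tokenisation", and the function names are invented for this example, not WebLLM's actual API:

```typescript
type Token = string;

// Stand-in for the model: a real model predicts each next token from the
// prompt plus the tokens generated so far; this stub yields a canned reply.
function* generate(_prompt: Token[]): Generator<Token> {
  const reply = ["Hello", " from", " your", " device"];
  for (const t of reply) yield t;
}

// Streams tokens to the UI as they are produced, the way the chat
// interface renders a response incrementally.
function streamToUi(prompt: string, onToken: (t: Token) => void): string {
  const tokens = prompt.split(" "); // toy tokenisation for illustration
  let full = "";
  for (const t of generate(tokens)) {
    full += t;
    onToken(t); // the UI appends each token as it arrives
  }
  return full;
}
```

The key point the sketch shows: the loop that produces tokens and the callback that renders them both live in the same browser tab, so no network request is needed between them.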
↑ Back to top

7. What WebGPU does in ThinkHere

WebGPU is a modern browser API that provides low-level access to your device's GPU. Unlike WebGL, which was designed for graphics rendering, WebGPU is built for general-purpose compute — including the matrix multiplications that power AI inference.

ThinkHere uses WebGPU to run language model inference directly on your hardware. Without WebGPU, the browser cannot access the GPU efficiently enough to run models at a usable speed.

WebGPU is supported in Chrome 113+, Edge 113+, and Safari 18+. It is not available on older browser versions or on iPhones due to iOS memory constraints.

↑ Back to top

8. What WebLLM does in ThinkHere

WebLLM is an open-source framework for running large language models in web browsers. ThinkHere uses WebLLM to handle model loading, tokenisation, context management, and inference scheduling — all inside the browser.

For most users, WebLLM is invisible. It is the layer that makes the browser capable of running a full language model pipeline without relying on a server. If you are curious about the technical detail, the WebLLM documentation is publicly available.

↑ Back to top

9. Why models are downloaded and cached locally

Because ThinkHere runs the AI model on your device, your browser needs access to the model weights — the large files that define how the model thinks and responds. These are downloaded from Qanata Lab's servers on first use and stored in your browser's Cache Storage.

The download is the only part of ThinkHere that requires an internet connection for inference. Once cached, the model loads locally in seconds on future visits, and every conversation happens entirely on-device with no network activity.

Model files are typically 600 MB–4.5 GB. You can clear cached models at any time in the app or through your browser's storage settings.
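Clearing a cached model ultimately maps onto the Cache Storage API's delete call. As a hedged sketch — the cache name here is hypothetical, not ThinkHere's real key:

```typescript
// "model-weights" is a hypothetical cache name for illustration.
async function clearModelCache(cacheName = "model-weights"): Promise<boolean> {
  const cacheStorage: any = (globalThis as any).caches;
  if (!cacheStorage) return false; // not running in a browser context
  // caches.delete resolves true if a cache with that name existed and was removed
  return cacheStorage.delete(cacheName);
}
```

After this, the next visit triggers a fresh download, exactly as on first use.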

↑ Back to top

10. What happens when you close the tab

When you close the ThinkHere tab, the active AI session ends and the model is unloaded from memory. The model weights remain cached in your browser's Cache Storage, so the next time you open ThinkHere the model loads quickly without a re-download.

If you are signed in, your conversation history is saved in your browser's local storage and will be available when you return. If you are using ThinkHere without an account, or in a private/incognito window, the conversation is not retained after the tab closes.

Qanata Lab has no access to your conversation history. It exists only in your browser's local storage, on your device.

↑ Back to top

Privacy and Security

11. How ThinkHere protects your privacy

ThinkHere's privacy model is architectural, not just policy-based. The AI model runs on your device using WebGPU. Your prompts and responses are processed locally and stored only in your browser's local storage. No network request carries your conversation to any server at any point during inference.

The only outbound requests ThinkHere makes are: downloading model weights on first use (standard HTTPS), and account authentication if you are signed in. Neither involves your conversation content.

You can verify this yourself — ThinkHere is open source under the MIT licence and the full codebase is publicly available.

↑ Back to top

12. Does ThinkHere send my prompts to the cloud?

No. ThinkHere does not send your prompts to the cloud. Every response is generated on your own device. There is no cloud inference provider in the pipeline — not for any tier, not for any model available in ThinkHere.

The only data that leaves your device is: model weight downloads on first use, and standard account authentication requests if you are signed in. Your conversation content is never part of either.

↑ Back to top

13. Who can read my conversations?

Nobody at Qanata Lab can read your conversations. Because inference runs locally and conversations are stored only in your browser's local storage, they never reach our servers.

The only people who can access your ThinkHere conversations are those with physical or account-level access to your device and browser. Treat your conversation history the same way you would treat any other sensitive data stored locally on your machine.

↑ Back to top

14. Where conversations are stored

Conversations are stored in your browser's local storage, on your device. They are not synced to Qanata Lab's servers and are not associated with your account even if you are signed in.

This means:

  • Clearing your browser's local storage will delete your conversation history
  • Using ThinkHere in a private or incognito window means conversations are not retained after the session
  • Switching devices or browsers will not carry your history across — it stays on the original device
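To illustrate what "local storage only" means in practice, here is a hedged sketch of reading history out of a localStorage-style store. The storage key and record shape are hypothetical, not the app's real schema:

```typescript
// Minimal shape of a key/value store like window.localStorage.
interface StorageLike {
  getItem(key: string): string | null;
}

// "conversations" is a hypothetical key; the real app's key may differ.
// The point: history is just JSON sitting in your own browser's storage.
function exportConversations(store: StorageLike, key = "conversations"): object[] {
  const raw = store.getItem(key);
  if (raw === null) return []; // nothing saved on this device
  return JSON.parse(raw);
}

// Usable against the real localStorage in a browser, or any compatible store:
const fakeStore: StorageLike = {
  getItem: () => JSON.stringify([{ title: "First chat", messages: 2 }]),
};
```

Because the data lives only in that store, clearing browser storage deletes it and no server-side copy exists to recover.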
↑ Back to top

15. Is ThinkHere suitable for sensitive information?

ThinkHere's local-first design makes it well-suited for privacy-sensitive work. Because your prompts and responses never leave your device, ThinkHere avoids the exposure risk that comes with cloud AI tools — where conversations may be logged, retained, or used for model training.

For professionally sensitive workflows — legal, medical, financial, or confidential business content — ThinkHere offers a meaningful privacy advantage over cloud-based alternatives. That said, you should still apply your organisation's own security standards for device access, browser data, and local storage.

ThinkHere does not replace formal compliance or data governance processes. If your organisation has specific requirements around AI tool usage, review those alongside ThinkHere's open-source codebase and Privacy Policy before deploying.

↑ Back to top

Accounts and Tiers

16. What you get with no account

Without an account you get immediate access to ThinkHere with no sign-up required. You can start chatting with SmolLM2 1.7B — a fast, lightweight chat model — as soon as the model has downloaded and cached on your device.

The no-account experience is the fastest way to try local AI. It is intentionally simple: one model, no history, no advanced controls. Everything still runs locally on your device.

↑ Back to top

17. Does signing in change where inference happens?

No. Signing in unlocks features but does not change the architecture. ThinkHere always runs the AI model locally in your browser, on your hardware — whether you have an account or not, whether you are on the free or paid tier.

Your account only manages your preferences, settings, and feature access. It has no role in the inference pipeline.

↑ Back to top

Models and Performance

18. Which models are available in ThinkHere?

Without an account, ThinkHere gives you access to SmolLM2 1.7B — a fast, lightweight chat model optimized for on-device inference. It is a good starting point for most tasks and works well on a wide range of hardware.

With a free signed-in account, the full model library is available — currently eight models and growing. Models vary in size, capability, and hardware requirements. Larger models produce higher quality output but require more RAM and take longer to load. Smaller models are faster and work on lighter hardware.

The in-app model selector shows available models alongside their approximate size and recommended RAM. Check the ThinkHere Docs for the current full model list.

↑ Back to top

19. Why the first model load takes time

The first load involves three steps that do not repeat on subsequent visits:

  • Download — model weights (600 MB–4.5 GB depending on the model) are downloaded from Qanata Lab's servers over HTTPS.
  • Cache — the weights are stored in your browser's Cache Storage so they are available locally in future.
  • Compile — WebLLM compiles the model for your specific GPU using WebGPU shader compilation. This step can take 30–60 seconds on first run.
After this one-time setup, the model loads from the local cache in seconds on every subsequent visit.

↑ Back to top

20. How much RAM do I need?

You need at least 4 GB of RAM to run the smallest model (SmolLM2 1.7B). Larger models require more — the in-app model selector shows recommended RAM for each option.

RAM available to ThinkHere is shared with your operating system, browser, and other open applications. If you are close to the minimum, closing other tabs and applications before loading a model can make a meaningful difference.

  • 4 GB RAM — minimum for SmolLM2 1.7B
  • 8–12 GB RAM — comfortable for mid-size models
  • 16 GB+ RAM — recommended for larger, higher-quality models
↑ Back to top

21. How fast is ThinkHere?

On a modern laptop or desktop with a dedicated or integrated GPU, ThinkHere typically generates 10–30+ tokens per second — fast enough for natural back-and-forth conversation. High-spec devices at the top end of that range feel noticeably more responsive.

Speed depends on three factors: the model you have selected (smaller models are faster), your device's GPU capability, and how much RAM is available. On older or lower-powered hardware, generation may be slower but still functional with a lighter model.

↑ Back to top

22. How to choose the best model for your device

Start with the smallest model your use case allows. SmolLM2 1.7B is a solid default — it runs on 4 GB of RAM and handles most everyday tasks well. Move up to a larger model only if you need higher quality output and your hardware is comfortable at the current level.

  • If responses feel slow, switch to a lighter model
  • If the model fails to load, you may not have enough free RAM — close other applications and try again, or select a smaller model
  • If quality is the priority and your device has 16 GB+ RAM, try one of the larger models in the library

The model selector in the app shows the recommended minimum RAM for each model to help you decide.
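The guidance above can be expressed as a small helper. The RAM bands mirror the ones listed earlier in this guide; mapping each band to a specific model from the model table is an illustrative assumption, not the app's actual selection logic:

```typescript
// RAM bands follow this guide's recommendations (4 GB minimum, 8-12 GB
// comfortable, 16 GB+ for large models). The specific model picks are
// illustrative assumptions, not ThinkHere's real defaults.
function suggestModel(freeRamGb: number): string {
  if (freeRamGb < 4) return "none";          // below the 4 GB minimum
  if (freeRamGb < 8) return "SmolLM2 1.7B";  // smallest model, runs on 4 GB
  if (freeRamGb < 16) return "Qwen3.5 2B";   // a comfortable mid-size pick
  return "Gemma 3n E4B";                     // larger model for 16 GB+ devices
}
```

For example, a laptop with roughly 8 GB free would land on a mid-size model, while a 4 GB machine should stay on the smallest one.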

↑ Back to top

Files and Knowledge Base

23. How file upload works in ThinkHere

File upload lets you add a local document or file as context for your conversation. Once uploaded, the content of the file is available to the model as part of the conversation — you can ask questions about it, summarise it, or use it as reference material.

Uploaded files are processed locally in your browser. They are not sent to Qanata Lab's servers. File upload is available on every tier, including without an account.

Check the ThinkHere app for the current list of supported file types and any size limits, as these may be updated as the product develops.

↑ Back to top

24. What is the ThinkHere knowledge base?

The ThinkHere knowledge base lets you store reference documents and use them as persistent context across conversations. Instead of uploading a file each time, you can add documents to your knowledge base once and have the model draw on them whenever relevant.

The knowledge base uses retrieval-augmented generation (RAG) to find and inject the most relevant sections of your stored documents into the model's context when you ask a question. This means the model can answer questions grounded in your own material rather than just its training data.

The knowledge base is available on the Pro tier.

↑ Back to top

25. What RAG means in ThinkHere

RAG stands for retrieval-augmented generation. It is the technique ThinkHere uses to let the model answer questions using documents from your knowledge base.

In practice it works like this: when you ask a question, ThinkHere searches your stored documents for the most relevant passages and includes them in the model's context alongside your question. The model then generates a response that draws on both its training and your specific material.

The benefit is that the model can give you accurate, grounded answers about your own documents — without you needing to paste the content into every message manually.
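The retrieval step can be illustrated with a toy example. A real system embeds text with a learned model (such as the embedding model listed under "Supporting models"); here, hand-made vectors and cosine similarity stand in for that, so this is a sketch of the idea, not ThinkHere's implementation:

```typescript
// Cosine similarity: how closely two embedding vectors point the same way.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface Chunk { text: string; embedding: number[]; }

// Returns the k stored chunks most similar to the query embedding --
// these are the passages that get injected into the model's context.
function retrieve(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

With real embeddings, a question about billing would score billing-related passages highest, and only those passages reach the model — which is why answers stay grounded in your documents.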

↑ Back to top

26. How local-first AI affects document workflows

With cloud AI tools, uploading a document means sending it to a third-party server. That content may be logged, retained, or used downstream in ways that are difficult to verify.

With ThinkHere, files you upload or add to your knowledge base stay on your device. The model reads and processes them locally. Nothing in your documents is transmitted to Qanata Lab. This makes ThinkHere particularly useful when working with confidential documents, internal business material, or personal content you would prefer not to share with a cloud service.

↑ Back to top

Device and Browser Support

27. Supported browsers for ThinkHere

ThinkHere requires a browser with WebGPU support. The following browsers are officially supported:

Browser           Minimum version   Notes
Google Chrome     113+              Recommended — best WebGPU support
Microsoft Edge    113+              Chromium-based; performs similarly to Chrome
Apple Safari      18+               Supported on macOS and iPadOS (M-series)
Mozilla Firefox   —                 Not currently supported; WebGPU is not stable in Firefox

If your browser is not listed or is below the minimum version, ThinkHere will display an error on load. Updating your browser to the latest version is the simplest fix.

↑ Back to top

28. What devices work best with ThinkHere

ThinkHere runs best on desktop and laptop computers with at least 6 GB of RAM and a modern GPU. Devices with a discrete GPU (such as an NVIDIA or AMD card) or a capable integrated GPU (such as those in Apple Silicon Macs) generally provide the fastest experience.

  • Best: Mac with Apple Silicon (M1 or later), Windows desktop or laptop with a discrete GPU
  • Good: Most modern laptops with 8 GB+ RAM and a recent integrated GPU
  • Limited: Older laptops with 6 GB RAM — use the smallest model only
  • Not supported: iPhone (iOS memory constraints prevent model loading)
↑ Back to top

29. Does ThinkHere work on iPhone?

No. ThinkHere does not currently work on iPhone. iOS imposes memory limits that prevent the browser from loading model weights of the size required to run a language model. This is a hardware and OS constraint, not a browser version issue.

We are monitoring improvements in browser capabilities and model efficiency. If iPhone support becomes practical in the future, we will announce it. For now, iPhone users should use ThinkHere on a desktop or laptop.

↑ Back to top

30. Does ThinkHere work on iPad?

ThinkHere may work on iPads with an M-series chip (M1 or later), using Safari 18+. M-series iPads have more RAM and better GPU access than older iPads, which makes model loading more feasible.

However, performance can vary. Expect longer load times and slower generation than on a Mac or PC. Older iPads without an M-series chip are unlikely to load models successfully due to memory constraints, similar to iPhone.

If ThinkHere fails to load on your iPad, check that you are using Safari 18+ and that the device has an M-series chip. If it still does not work, use a desktop or laptop instead.

↑ Back to top

31. How to check whether WebGPU is enabled

ThinkHere checks for WebGPU support automatically when you open the app and will display an error if it is unavailable. If you want to check manually:

In Chrome or Edge:

  • Type chrome://gpu in the address bar and press Enter.
  • Look for WebGPU in the Graphics Feature Status list. It should show Hardware accelerated.
  • If it shows Disabled or Software only, your hardware or driver may not support WebGPU — update your graphics drivers and try again.
In Safari:

  • Open Safari Preferences → Advanced → enable Show features for web developers.
  • Go to Develop → Experimental Features and confirm WebGPU is enabled.
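The same check can be run programmatically from a page's developer console. This is the standard WebGPU feature-detection pattern: confirm the API exists, then ask for an adapter:

```typescript
// Standard WebGPU feature detection: the API is exposed and an adapter
// (a handle to a usable GPU) is actually granted.
async function hasWebGPU(): Promise<boolean> {
  const nav: any = (globalThis as any).navigator;
  if (!nav || !nav.gpu) return false; // browser does not expose WebGPU at all
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null; // null means no usable GPU adapter was found
}
```

A `false` result matches the "WebGPU unavailable" error ThinkHere shows on load.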

↑ Back to top

Troubleshooting

32. ThinkHere is stuck loading a model

If the loading progress bar has stopped moving or the model has not finished loading after several minutes, try the following steps in order:

  • Check your internet connection. The first load requires a 600 MB–4.5 GB download. A slow or interrupted connection can cause the download to stall. Refresh the page to try again.
  • Refresh the page. ThinkHere will resume from the cached portion of the download if part of it completed.
  • Close other browser tabs and applications to free up RAM. If available memory drops too low, the model may fail to load.
  • Check that your browser supports WebGPU. See How to check whether WebGPU is enabled.
  • If the issue persists, try a different supported browser (Chrome 113+ is recommended) or select a smaller model from the model picker.

↑ Back to top

33. ThinkHere feels slow

Slow generation speed is usually caused by one or more of these factors:

  • Model size — larger models generate more slowly. Switch to a smaller model if speed is a priority.
  • Available RAM — if your device is low on memory, generation slows. Close other applications and browser tabs.
  • Background workloads — other GPU-intensive tasks (video playback, games, video calls) compete with ThinkHere for GPU resources.
  • Browser — Chrome and Edge currently have the most mature WebGPU implementations. Try switching if you are on Safari and performance is poor.

On modern hardware with the recommended RAM, expect 10–30+ tokens per second with appropriately sized models.

↑ Back to top

34. My browser says WebGPU is unavailable

If ThinkHere reports that WebGPU is unavailable, work through these checks:

  • Confirm you are using a supported browser: Chrome 113+, Edge 113+, or Safari 18+. Firefox does not support WebGPU.
  • Update your browser to the latest version. WebGPU support has improved significantly in recent releases.
  • Update your graphics drivers. Outdated drivers can block WebGPU even on supported browsers.
  • Check that hardware acceleration is enabled in your browser settings. In Chrome, go to Settings → System and make sure Use hardware acceleration when available is turned on.
  • If you are on a corporate device, IT policy may have disabled WebGPU. Contact your IT team.
  • If WebGPU remains unavailable after these steps, your hardware may not support it. ThinkHere requires a GPU that is compatible with the WebGPU standard.

↑ Back to top

35. I do not have enough memory to load a model

If ThinkHere fails with a memory error, or the tab crashes during model loading, your device does not have enough free RAM for the selected model.

  • Close other browser tabs — each tab uses memory that ThinkHere needs.
  • Close other applications running in the background.
  • Select a smaller model. The minimum is 4 GB of RAM for SmolLM2 1.7B. Larger models require more.
  • Restart your browser with only ThinkHere open before attempting to load the model again.
If you consistently cannot load even the smallest model after following these steps, your device may not meet the minimum requirements: at least 4 GB of RAM for SmolLM2 1.7B, with loading most reliable when around 6 GB is actually free.

↑ Back to top

36. The app works on one browser but not another

WebGPU support varies between browsers. Chrome and Edge currently have the most stable and performant WebGPU implementations. Safari's support is improving but may behave differently on some hardware combinations. Firefox does not support WebGPU.

If ThinkHere works in Chrome but not in another browser, this is expected. Use Chrome 113+ or Edge 113+ for the most consistent experience. If Safari 18+ is your only option, make sure Experimental WebGPU is enabled (see How to check whether WebGPU is enabled).

↑ Back to top

FAQ

37. Is ThinkHere really private?

Yes. ThinkHere's privacy is architectural. The AI model runs on your device using WebGPU, and your prompts and responses are stored only in your browser's local storage. No network request carries your conversation to any server during inference — not for any tier, not for any model.

You can verify this yourself. ThinkHere is open source under the MIT licence. The codebase is publicly available for inspection. You can also watch your browser's network activity while using ThinkHere — you will see no outbound requests during a conversation.

↑ Back to top

38. Do I need an account to use ThinkHere?

No. You can start using ThinkHere immediately without creating an account. Visit the site in a supported browser and SmolLM2 1.7B loads automatically after a one-time model download.

A free account unlocks the full model library, conversation history, export, and long conversation memory; the Pro tier adds system prompts, generation controls, the knowledge base, and voice transcription. See ThinkHere pricing and tiers explained for the full comparison.

↑ Back to top

39. Does ThinkHere use cloud inference?

No. ThinkHere does not use cloud inference at any tier. Every response is generated locally on your device using WebGPU. There is no remote AI server processing your prompts — not even in the background.

The only connection ThinkHere makes to external servers is downloading model weights on first use, and account authentication if you are signed in. Neither involves your conversation content.

↑ Back to top

40. Why do models need to be downloaded?

Because ThinkHere runs the AI model on your device rather than on a server, your browser needs the model files available locally. Model weights — the data that defines how the model responds — are typically 600 MB–4.5 GB and cannot be generated or approximated on the fly.

The download happens once per model. After that, ThinkHere loads the model from your browser's local cache in seconds. You only need an internet connection for the initial download, not for ongoing use.

↑ Back to top

41. Can I use ThinkHere on mobile?

iPhone is not currently supported. iOS memory constraints prevent the browser from loading model weights of the size required to run a language model in ThinkHere.

M-series iPads (M1 or later) may work using Safari 18+, though performance varies and load times are longer than on a desktop or laptop.

Android is tentatively supported on newer high-end devices with Chrome 113+. Because of the memory demands, most phones will crash the browser or fail to load a model at all.

For the best experience, use ThinkHere on a desktop or laptop with Chrome 113+, Edge 113+, or Safari 18+ and at least 6 GB of free RAM.

↑ Back to top

By using ThinkHere you agree to our Terms of Use, Privacy Policy and Usage Policies · A Qanata Lab product