Harvey AI FAQ

Discover how Harvey’s tailored models and advanced security protocols ensure accuracy, reliability, and data protection.

Updated over 3 months ago

Overview

Harvey is a professional-class AI platform designed to enhance productivity. Users assign tasks in natural language, much as they would to a colleague, and provide relevant documents and data for context.

Harvey supports a wide range of tasks, from drafting contracts to conducting research and analysis, making workflows more efficient. We continually expand Harvey’s capabilities to meet the needs of various practices, including data rooms and e-discovery, ensuring comprehensive support for professional services.

For a detailed overview, visit our ‘Introduction to Harvey’ article.


AI + Harvey

Q: What does ‘Generative AI’ do?

Generative AI refers to a class of artificial intelligence systems designed to generate new content, ideas, or solutions based on patterns in the data they have learned from. These systems use machine learning models to create text, images, music, and even code or designs, mimicking human creativity. Unlike traditional AI, which focuses on recognizing patterns or making decisions based on existing data, generative AI can produce novel outputs, often by learning from vast datasets.

Q: How is Harvey different from ChatGPT?

Harvey specializes in complex professional tasks, particularly in legal and tax workflows.

Q: What are Harvey's knowledge cutoff dates?

The knowledge cutoff dates are October 2023 for Assist mode and October 2021 for Draft mode in Assistant.

Q: Does Harvey require extensive “prompt engineering”?

No, Harvey minimizes the need for detailed prompt engineering. Most tasks can be completed with simple, straightforward instructions, and we provide examples to guide users.

Think of Harvey as a digital associate: a fast and effective thought partner and generator of first drafts. Like the work of a junior colleague, its output requires verification, but it saves significant time and enhances overall quality. Harvey also lets you save favorite prompts for future use, which further speeds up workflows.

For more information, visit our Prompt Writing article.

Q: What are hallucinations?

Hallucinations are a type of error that occurs when LLMs (Large Language Models) generate inaccurate or fabricated information. This can happen because LLMs are trained on large and diverse corpora of text, but may lack sufficient domain knowledge or logical reasoning.

Q: How do you prevent hallucinations?

Harvey minimizes hallucinations by using domain-specific models and knowledge bases:

  • Domain-Specific Models: These models are trained on large datasets of specialized legal documents, capturing the nuances and complexities of the law. This narrows the gap between general-purpose models, which lack domain expertise, and human experts with specialized knowledge.

  • Knowledge Bases: Many of Harvey’s tools use domain-specific resources, such as statutes, case law, and legal ontologies, to ground responses in authoritative sources. For legal tasks, Harvey references selected case law and statutes, and when documents are uploaded, the output will include citations from the user-provided materials.
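The idea of grounding a response in source material can be illustrated with a minimal sketch. This is not Harvey's implementation: the function names, the keyword-overlap scoring, the prompt wording, and the sample passages are all hypothetical, chosen only to show how retrieved passages can be attached to a question with citation ids before a model is asked to answer.

```python
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(question: str, passages: dict[str, str], k: int = 2) -> list[str]:
    """Return the ids of the k passages sharing the most tokens with the question.

    Hypothetical scoring: a real system would more likely use vector embeddings.
    """
    q_counts = Counter(tokenize(question))
    scored = []
    for source_id, text in passages.items():
        overlap = sum((q_counts & Counter(tokenize(text))).values())
        if overlap > 0:
            scored.append((overlap, source_id))
    # Highest overlap first; ties broken alphabetically by source id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [source_id for _, source_id in scored[:k]]

def build_grounded_prompt(question: str, passages: dict[str, str], k: int = 2) -> str:
    """Assemble a prompt that restricts the answer to cited sources."""
    cited = retrieve(question, passages, k)
    context = "\n".join(f"[{sid}] {passages[sid]}" for sid in cited)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"{context}\n"
        f"Question: {question}"
    )

# Illustrative, made-up passages standing in for user-uploaded documents.
PASSAGES = {
    "lease.pdf#p3": "The tenant must give ninety days written notice before terminating the lease.",
    "lease.pdf#p7": "Rent is due on the first business day of each month.",
    "memo.docx#p1": "The board approved the merger on 4 March.",
}

QUESTION = "How much notice must the tenant give to terminate the lease?"
prompt = build_grounded_prompt(QUESTION, PASSAGES, k=1)
```

Because the prompt carries the passage ids, an answer produced from it can point back to the exact source it relied on, which makes verification easier for the reader.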


Harvey’s AI Models

Q: Are Harvey’s models trained on legal data?

Yes. Harvey’s models are fine-tuned using large datasets from legal documents, user actions, and specialist feedback.

  • Legal document data: We build large proprietary datasets of legal data that are representative of the tasks faced by our clients.

  • Legal behavioral data: We collect large datasets of all the actions lawyers take to complete a task in order to train our systems to act like a lawyer. This approach is often called imitation learning and is how systems like AlphaGo or Tesla’s Autopilot learn from vast amounts of human gameplay or driving data.

  • Specialist feedback data: We collect large datasets of lawyer or specialist “edits”. Think of this as similar to a senior lawyer giving feedback to a junior lawyer. This data is used to perform Reinforcement Learning from Human Feedback (RLHF).

Q: Is Harvey trained on my data?

No. Harvey does not use customer data or content for training. We follow strict data exclusion policies and leverage the expertise of a security advisory board composed of eminent CISOs from leading financial and cloud entities. We also implement logical data segregation and stringent internal controls to preclude any data cross-contamination.

Q: What language models does Harvey use?

Harvey selects the language model for each task based on performance. We currently have a deep partnership with OpenAI and plan to integrate other models in the future to ensure top performance.

We monitor the performance of vector embedding models and use the most effective ones in our tools, updating them as needed to reflect ongoing developments. Harvey’s strong relationships with leading model developers, along with its deep bench of engineering and machine learning talent, ensure that Harvey continually and securely leverages the latest advancements.


Data Security

Q: Is Harvey secure?

Yes. Harvey employs rigorous security measures to ensure the confidentiality, integrity, and availability of customer data. We maintain SOC 2 Type 2 and ISO 27001 certifications and comply with GDPR and CCPA.

Harvey is hosted on Microsoft Azure and benefits from Microsoft’s best-in-class security features, including physical security measures at data centers. Data is encrypted at rest with AES-256 and in transit over public networks with TLS 1.2 or above.
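As an illustration of what a "TLS 1.2 or above" policy looks like from a client's side, the following minimal sketch uses Python's standard `ssl` module to build a context that refuses older protocol versions. It is a generic example, not Harvey's configuration; the helper name `make_client_context` is hypothetical.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects anything below TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse to negotiate TLS 1.0 or 1.1.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = make_client_context()
```

A context like this can be passed to standard-library clients, for example through the `context` parameter of `urllib.request.urlopen`.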

Q: How does Harvey approach security?

Harvey takes a multi-layered approach to security, focusing on both preventive measures and rapid response capabilities. Key features include robust internal authentication and access controls, with unique user identifiers and hardware-backed FIDO2 multi-factor authentication for all personnel. Employee devices are centrally managed to ensure compliance with security policies, and data backups are redundantly stored across multiple U.S. data centers, unless specified otherwise.

Harvey’s secure development lifecycle follows industry best practices, including thorough code reviews, secure coding standards, and both static and dynamic application security testing. Additionally, Harvey uses network segmentation, firewalls, and secure web application protocols to further protect user data.


For more information, visit our Security Portal.
