`agent_k.agents.scientist`

The SCIENTIST research agent module.

Scientist agent - research and analysis for AGENT-K.

@dev: | See module for implementation details and extension points.

@graph:
    id: agent_k.agents.scientist
    provides:
        - agent_k.agents.scientist:ScientistAgent
        - agent_k.agents.scientist:ScientistDeps
        - agent_k.agents.scientist:ScientistSettings
        - agent_k.agents.scientist:ResearchReport
        - agent_k.agents.scientist:ResearchFinding
        - agent_k.agents.scientist:LeaderboardAnalysis
        - agent_k.agents.scientist:get_external_data_policy
        - agent_k.agents.scientist:scientist_agent
    consumes:
        - agent_k.core.protocols:PlatformAdapter
        - agent_k.toolsets.kaggle:kaggle_toolset
        - agent_k.toolsets.search:search_toolset
        - agent_k.toolsets.browser:browser_toolset
    pattern: agent-singleton

@similar:
    - id: agent_k.agents.lobbyist
      when: "Use for competition discovery rather than research synthesis."
    - id: agent_k.agents.evolver
      when: "Use for solution optimization rather than research."

@agent-guidance:
    do:
        - "Use agent_k.agents.scientist as the canonical home for this capability."
    do_not:
        - "Create parallel modules without updating @similar or @graph."

@human-review:
    last-verified: 2026-01-26
    owners:
        - agent-k-core

(c) Mike Casale 2025. Licensed under the MIT License.

ScientistSettings

Bases: BaseSettings

Configuration for the Scientist agent.

@pattern: name: settings rationale: "Centralizes research agent configuration." violations: "Per-run overrides lead to inconsistent outputs."

Source code in agent_k/agents/scientist.py

class ScientistSettings(BaseSettings):
    """Configuration for the Scientist agent.

    @pattern:
        name: settings
        rationale: "Centralizes research agent configuration."
        violations: "Per-run overrides lead to inconsistent outputs."
    """

    model_config = SettingsConfigDict(env_prefix="SCIENTIST_", env_file=".env", extra="ignore", validate_default=True)
    model: str = Field(default=DEFAULT_MODEL, description="Model identifier for research tasks")
    temperature: float = Field(default=0.3, ge=0.0, le=2.0, description="Sampling temperature for research prompts")
    max_tokens: int = Field(default=4096, ge=1, description="Maximum tokens for responses")
    tool_retries: int = Field(default=2, ge=0, description="Tool retry attempts")
    output_retries: int = Field(default=1, ge=0, description="Output validation retry attempts")
    max_paper_results: int = Field(default=10, ge=1, description="Maximum papers to retrieve")
    max_notebook_results: int = Field(default=10, ge=1, description="Maximum notebooks to retrieve")

    @property
    def model_settings(self) -> ModelSettings:
        """Build ModelSettings for the configured model."""
        return ModelSettings(temperature=self.temperature, max_tokens=self.max_tokens)

model_settings `property`

model_settings: ModelSettings

Build ModelSettings for the configured model.
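
As a rough illustration of how the `SCIENTIST_` environment prefix and the `model_settings` property fit together, the sketch below uses a plain dataclass as a stand-in. It does not use pydantic-settings: `SettingsSketch`, `ModelSettingsSketch`, and `from_env` are hypothetical names, only two of the documented fields are modelled, and treating `ModelSettings` as a dict-like mapping is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, TypedDict


class ModelSettingsSketch(TypedDict, total=False):
    """Stand-in for ModelSettings; assumed here to behave like a mapping."""

    temperature: float
    max_tokens: int


@dataclass(frozen=True)
class SettingsSketch:
    """Hypothetical stand-in for ScientistSettings (two fields only)."""

    temperature: float = 0.3
    max_tokens: int = 4096

    def __post_init__(self) -> None:
        # Mirror the documented bounds: 0.0 <= temperature <= 2.0, max_tokens >= 1.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be within [0.0, 2.0]")
        if self.max_tokens < 1:
            raise ValueError("max_tokens must be >= 1")

    @classmethod
    def from_env(cls, env: Mapping[str, str], prefix: str = "SCIENTIST_") -> "SettingsSketch":
        # Only variables carrying the prefix are considered; anything else is ignored.
        kwargs = {}
        if f"{prefix}TEMPERATURE" in env:
            kwargs["temperature"] = float(env[f"{prefix}TEMPERATURE"])
        if f"{prefix}MAX_TOKENS" in env:
            kwargs["max_tokens"] = int(env[f"{prefix}MAX_TOKENS"])
        return cls(**kwargs)

    @property
    def model_settings(self) -> ModelSettingsSketch:
        # Same shape as the documented property: temperature plus max_tokens.
        return ModelSettingsSketch(temperature=self.temperature, max_tokens=self.max_tokens)


settings = SettingsSketch.from_env({"SCIENTIST_TEMPERATURE": "0.1", "UNRELATED": "x"})
print(settings.model_settings)
```

The unprefixed `UNRELATED` key is skipped, and `max_tokens` keeps its default because no `SCIENTIST_MAX_TOKENS` entry is supplied.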

ScientistDeps `dataclass`

Dependencies for the Scientist agent.

@pattern: name: dependency-container rationale: "Groups runtime services for research tools." violations: "Hidden globals make research runs nondeterministic."

@collaborators:
    required:
        - httpx:AsyncClient
        - agent_k.core.protocols:PlatformAdapter
        - agent_k.core.models:Competition
    optional:
        - agent_k.core.models:LeaderboardEntry
    injection: constructor
    lifecycle: "Allocated per agent run."

Source code in agent_k/agents/scientist.py

@dataclass
class ScientistDeps:
    """Dependencies for the Scientist agent.

    @pattern:
        name: dependency-container
        rationale: "Groups runtime services for research tools."
        violations: "Hidden globals make research runs nondeterministic."

    @collaborators:
        required:
            - httpx:AsyncClient
            - agent_k.core.protocols:PlatformAdapter
            - agent_k.core.models:Competition
        optional:
            - agent_k.core.models:LeaderboardEntry
        injection: constructor
        lifecycle: "Allocated per agent run."
    """

    http_client: httpx.AsyncClient
    platform_adapter: PlatformAdapter
    competition: Competition
    leaderboard: list[LeaderboardEntry] = field(default_factory=list)
    research_cache: dict[str, Any] = field(default_factory=dict)

    async def refresh_leaderboard(self) -> None:
        """Refresh leaderboard from the platform.

        @notice: |
            Updates the cached leaderboard in-place.

        @effects:
            io:
                - Kaggle API request
            state:
                - self.leaderboard
        """
        self.leaderboard = await self.platform_adapter.get_leaderboard(self.competition.id, limit=100)

refresh_leaderboard `async`

refresh_leaderboard() -> None

Refresh leaderboard from the platform.

@notice: | Updates the cached leaderboard in-place.

@effects: io: - Kaggle API request state: - self.leaderboard

Source code in agent_k/agents/scientist.py

async def refresh_leaderboard(self) -> None:
    """Refresh leaderboard from the platform.

    @notice: |
        Updates the cached leaderboard in-place.

    @effects:
        io:
            - Kaggle API request
        state:
            - self.leaderboard
    """
    self.leaderboard = await self.platform_adapter.get_leaderboard(self.competition.id, limit=100)
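
To see the refresh behaviour in isolation, the sketch below swaps the real platform adapter for a stub. `EntrySketch`, `StubAdapter`, and `DepsSketch` are hypothetical stand-ins written for this example; only the body of `refresh_leaderboard` follows the documented method.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class EntrySketch:
    """Hypothetical leaderboard row with the fields used on this page."""

    rank: int
    team_name: str
    score: float


class StubAdapter:
    """Hypothetical adapter exposing only get_leaderboard."""

    async def get_leaderboard(self, competition_id: str, limit: int = 100) -> list[EntrySketch]:
        # A real adapter would call the platform API; this one returns fixed rows.
        rows = [EntrySketch(1, "alpha", 0.91), EntrySketch(2, "beta", 0.88)]
        return rows[:limit]


@dataclass
class DepsSketch:
    platform_adapter: StubAdapter
    competition_id: str
    leaderboard: list[EntrySketch] = field(default_factory=list)

    async def refresh_leaderboard(self) -> None:
        # Same shape as the documented method: replace the cached list.
        self.leaderboard = await self.platform_adapter.get_leaderboard(self.competition_id, limit=100)


deps = DepsSketch(platform_adapter=StubAdapter(), competition_id="demo-competition")
asyncio.run(deps.refresh_leaderboard())
print([entry.team_name for entry in deps.leaderboard])
```

Because the adapter is injected through the constructor, a test can pass a stub like this one and never touch the network.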

ResearchFinding

Bases: BaseModel

Individual research finding.

@pattern: name: output-model rationale: "Consistent schema for research findings." violations: "Ad-hoc dicts are hard to validate."

Source code in agent_k/agents/scientist.py

class ResearchFinding(BaseModel):
    """Individual research finding.

    @pattern:
        name: output-model
        rationale: "Consistent schema for research findings."
        violations: "Ad-hoc dicts are hard to validate."
    """

    model_config = ConfigDict(frozen=True, str_strip_whitespace=True, validate_default=True)
    schema_version: str = Field(default=SCHEMA_VERSION, description="Schema version")
    category: str = Field(description="Category of finding")
    title: str = Field(description="Brief title")
    summary: str = Field(description="Detailed summary")
    relevance_score: float = Field(ge=0, le=1, description="Relevance to competition")
    sources: list[str] = Field(default_factory=list, description="Source URLs")

LeaderboardAnalysis

Bases: BaseModel

Analysis of competition leaderboard.

@pattern: name: output-model rationale: "Stable schema for leaderboard summaries." violations: "Inconsistent shapes break downstream synthesis."

Source code in agent_k/agents/scientist.py

class LeaderboardAnalysis(BaseModel):
    """Analysis of competition leaderboard.

    @pattern:
        name: output-model
        rationale: "Stable schema for leaderboard summaries."
        violations: "Inconsistent shapes break downstream synthesis."
    """

    model_config = ConfigDict(frozen=True, str_strip_whitespace=True, validate_default=True)
    schema_version: str = Field(default=SCHEMA_VERSION, description="Schema version")
    top_score: float = Field(description="Best leaderboard score")
    median_score: float = Field(description="Median leaderboard score")
    score_distribution: str = Field(description="Description of score distribution")
    common_approaches: list[str] = Field(description="Inferred common approaches")
    improvement_opportunities: list[str] = Field(description="Potential improvement areas")
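
For intuition about `top_score` and `median_score`: `analyze_leaderboard` (shown later on this page) takes the median as `sorted(scores)[len(scores) // 2]`, which picks the upper of the two middle values when the number of entries is even. The sketch below compares that with `statistics.median`; `leaderboard_stats` is a hypothetical helper, not part of the module.

```python
import statistics


def leaderboard_stats(scores: list[float]) -> dict[str, float]:
    """Hypothetical helper mirroring the index-based median used on this page."""
    ordered = sorted(scores)
    return {
        "top_score": max(ordered),
        "median_score": ordered[len(ordered) // 2],  # upper middle value for even counts
        "score_range": max(ordered) - min(ordered),
    }


scores = [0.90, 0.70, 0.80, 0.60]
stats = leaderboard_stats(scores)
print(stats["median_score"])      # index-based: one of the actual scores
print(statistics.median(scores))  # averages the two middle values instead
```

Both the index-based median and `max` as the "top" score assume the caller knows the metric direction; for a lower-is-better metric, `top_score` computed this way would be the worst entry.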

ResearchReport

Bases: BaseModel

Complete research report for a competition.

@pattern: name: output-model rationale: "Aggregates research findings into a stable schema." violations: "Free-form outputs hinder automation."

Source code in agent_k/agents/scientist.py

class ResearchReport(BaseModel):
    """Complete research report for a competition.

    @pattern:
        name: output-model
        rationale: "Aggregates research findings into a stable schema."
        violations: "Free-form outputs hinder automation."
    """

    model_config = ConfigDict(frozen=True, str_strip_whitespace=True, validate_default=True)
    schema_version: str = Field(default=SCHEMA_VERSION, description="Schema version")
    competition_id: str = Field(description="Competition identifier")
    domain_findings: list[ResearchFinding] = Field(
        default_factory=list, description="Domain-specific research findings"
    )
    technique_findings: list[ResearchFinding] = Field(
        default_factory=list, description="Technique-focused research findings"
    )
    leaderboard_analysis: LeaderboardAnalysis | None = Field(default=None, description="Leaderboard analysis summary")
    recommended_approaches: list[str] = Field(default_factory=list, description="Recommended modeling approaches")
    estimated_baseline_score: float | None = Field(default=None, description="Estimated baseline score")
    key_challenges: list[str] = Field(default_factory=list, description="Primary competition challenges")
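
The output validator on `ScientistAgent` (`_validate_research_completeness`, shown later on this page) rejects a final report that has no recommended approaches or no findings at all. A minimal sketch of that rule on plain dicts follows; `missing_requirements` is a hypothetical helper, and unlike the real validator, which raises on the first problem, it collects every problem it finds.

```python
from typing import Any


def missing_requirements(report: dict[str, Any]) -> list[str]:
    """Return the completeness problems a final report would be retried for."""
    problems: list[str] = []
    if not report.get("recommended_approaches"):
        problems.append("Research must include recommended approaches.")
    # Either kind of finding is enough; only a report with neither is incomplete.
    if not report.get("domain_findings") and not report.get("technique_findings"):
        problems.append("Research must include at least one finding.")
    return problems


draft = {"competition_id": "demo-competition", "recommended_approaches": ["gradient boosting"]}
print(missing_requirements(draft))
```

All list fields on `ResearchReport` default to empty, so a report built from `competition_id` alone is schema-valid but would still fail both checks.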

ScientistAgent

Bases: MemoryMixin

Scientist agent encapsulating research and analysis functionality.

@notice: | Performs leaderboard, literature, and notebook analysis. Use the module-level scientist_agent or agent registry.

@dev: | Registers research tools and synthesizes outputs into ResearchReport.

@pattern: name: agent-singleton rationale: "Single instance keeps memory/tool registration consistent." violations: "Multiple instances duplicate tool registrations."

@collaborators:
    required:
        - agent_k.core.protocols:PlatformAdapter
        - agent_k.core.models:Competition
    optional:
        - httpx:AsyncClient
    injection: deps via RunContext
    lifecycle: "Module-level singleton at import time."

@concurrency: model: asyncio safe: false reason: "Mutates internal caches and tool registration."

@invariants:
    - "self._agent is initialized after __init__ completes."
    - "self._toolset registers research tools exactly once."

Source code in agent_k/agents/scientist.py

class ScientistAgent(MemoryMixin):
    """Scientist agent encapsulating research and analysis functionality.

    @notice: |
        Performs leaderboard, literature, and notebook analysis.
        Use the module-level scientist_agent or agent registry.

    @dev: |
        Registers research tools and synthesizes outputs into ResearchReport.

    @pattern:
        name: agent-singleton
        rationale: "Single instance keeps memory/tool registration consistent."
        violations: "Multiple instances duplicate tool registrations."

    @collaborators:
        required:
            - agent_k.core.protocols:PlatformAdapter
            - agent_k.core.models:Competition
        optional:
            - httpx:AsyncClient
        injection: deps via RunContext
        lifecycle: "Module-level singleton at import time."

    @concurrency:
        model: asyncio
        safe: false
        reason: "Mutates internal caches and tool registration."

    @invariants:
        - "self._agent is initialized after __init__ completes."
        - "self._toolset registers research tools exactly once."
    """

    def __init__(
        self, settings: Annotated[ScientistSettings | None, Doc("Optional settings override.")] = None
    ) -> None:
        """Initialize the Scientist agent.

        @notice: |
            Builds the agent singleton and registers tools.

        @dev: |
            Initializes memory backend, toolset, and pydantic-ai Agent.

        @state-changes:
            - self._settings
            - self._toolset
            - self._agent
        """
        self._settings = settings or ScientistSettings()
        self._toolset: FunctionToolset[ScientistDeps] = FunctionToolset(id="scientist")
        self._memory_backend = self._init_memory_backend()
        self._register_tools()
        self._agent = self._create_agent()
        register_agent("scientist", self._agent)
        self._setup_memory()

    @property
    def agent(self) -> Agent[ScientistDeps, ResearchReport]:
        """Return the underlying pydantic-ai Agent."""
        return self._agent

    @property
    def settings(self) -> ScientistSettings:
        """Return current settings."""
        return self._settings

    async def analyze_leaderboard(
        self,
        ctx: RunContext[ScientistDeps],
        refresh: Annotated[bool, Doc("Whether to refresh leaderboard before analysis.")] = True,
    ) -> dict[str, Any]:
        """Analyze the current competition leaderboard.

        @notice: |
            Computes distribution stats and common approaches from leaderboard entries.

        @effects:
            io:
                - Kaggle API request (optional)
            state:
                - ctx.deps.leaderboard
        """
        with logfire.span("scientist.analyze_leaderboard"):
            if refresh:
                await ctx.deps.refresh_leaderboard()

            leaderboard = ctx.deps.leaderboard
            if not leaderboard:
                return {"error": "No leaderboard data available"}

            scores = [e.score for e in leaderboard]
            return {
                "total_teams": len(leaderboard),
                "top_score": max(scores),
                "median_score": sorted(scores)[len(scores) // 2],
                "score_range": max(scores) - min(scores),
                "top_10_scores": [e.score for e in leaderboard[:10]],
                "top_teams": [{"rank": e.rank, "team": e.team_name, "score": e.score} for e in leaderboard[:10]],
            }

    async def get_kaggle_notebooks(
        self,
        ctx: RunContext[ScientistDeps],
        sort_by: Annotated[str, Doc("Sort key for Kaggle notebooks.")] = "voteCount",
        max_results: Annotated[int, Doc("Maximum notebooks to retrieve."), Range(1, 50)] = 10,
    ) -> list[dict[str, Any]]:
        """Get top notebooks for the competition.

        @notice: |
            Retrieves competition notebooks ordered by vote count or freshness.

        @effects:
            io:
                - Kaggle API request
        """
        with logfire.span("scientist.get_notebooks"):
            notebooks = await self._fetch_kaggle_notebooks(ctx, sort_by=sort_by, max_results=max_results)
            if notebooks:
                return notebooks

            await ctx.deps.refresh_leaderboard()
            return [
                {
                    "title": f"{ctx.deps.competition.title} solution by {entry.team_name}",
                    "votes": max(1, (len(ctx.deps.leaderboard) - entry.rank + 1) * 5),
                    "author": entry.team_name,
                    "techniques": self._infer_techniques_from_text(" ".join(ctx.deps.competition.tags)),
                }
                for entry in ctx.deps.leaderboard[:max_results]
            ]

    async def analyze_top_kernels(
        self,
        ctx: RunContext[ScientistDeps],
        sort_by: Annotated[str, Doc("Sort key for Kaggle kernels.")] = "voteCount",
        max_results: Annotated[int, Doc("Maximum kernels to analyze."), Range(1, 25)] = 5,
    ) -> dict[str, Any]:
        """Analyze top Kaggle kernels for techniques and patterns.

        @notice: |
            Downloads and summarizes kernel code to identify techniques.

        @effects:
            io:
                - Kaggle API request
        """
        with logfire.span("scientist.analyze_top_kernels"):
            notebooks = await self._fetch_kaggle_notebooks(ctx, sort_by=sort_by, max_results=max_results)
            analyses: list[dict[str, Any]] = []
            for notebook in notebooks[:max_results]:
                kernel_ref = notebook.get("ref") or self._extract_kernel_ref(notebook.get("url", ""))
                code = await self._fetch_kernel_code(ctx, kernel_ref) if kernel_ref else None
                extracted = (
                    self._extract_techniques_from_code(code)
                    if code
                    else {"techniques": notebook.get("techniques", []), "target_transforms": [], "stacking": []}
                )
                analyses.append(
                    {
                        "title": notebook.get("title"),
                        "author": notebook.get("author"),
                        "votes": notebook.get("votes"),
                        "url": notebook.get("url"),
                        "ref": kernel_ref,
                        "techniques": extracted["techniques"],
                        "target_transforms": extracted["target_transforms"],
                        "stacking": extracted["stacking"],
                        "code_available": bool(code),
                    }
                )
            return {"kernels": analyses, "summary": self._summarize_kernel_analysis(analyses)}

    async def extract_techniques(
        self, ctx: RunContext[ScientistDeps], kernel_code: Annotated[str, Doc("Raw kernel code to analyze.")]
    ) -> dict[str, Any]:
        """Extract modeling techniques from Kaggle kernel code.

        @notice: |
            Scans code for common ML techniques and feature engineering patterns.

        @effects:
            state:
                - none
        """
        _ = ctx
        return self._extract_techniques_from_code(kernel_code)

    async def synthesize_strategy(
        self,
        ctx: RunContext[ScientistDeps],
        techniques: Annotated[list[str] | None, Doc("Extracted modeling techniques.")] = None,
        target_transforms: Annotated[list[str] | None, Doc("Candidate target transforms.")] = None,
        stacking: Annotated[list[str] | None, Doc("Stacking or blending hints.")] = None,
    ) -> dict[str, Any]:
        """Synthesize a prioritized strategy from extracted techniques.

        @notice: |
            Aggregates extracted techniques into prioritized action items.

        @effects:
            state:
                - none
        """
        _ = ctx
        techniques = techniques or []
        target_transforms = target_transforms or []
        stacking = stacking or []
        plan: list[dict[str, Any]] = []
        priority = 1

        if target_transforms:
            plan.append(
                {
                    "priority": priority,
                    "action": "target_transform",
                    "details": f"Evaluate {', '.join(sorted(set(target_transforms)))} for the target.",
                }
            )
            priority += 1

        if stacking or "stacking" in techniques or "blending" in techniques:
            plan.append(
                {
                    "priority": priority,
                    "action": "stacking_blending",
                    "details": "Prototype a simple stacking/blending ensemble with diverse base models.",
                }
            )
            priority += 1

        if "feature_scaling" in techniques or "polynomial_features" in techniques or "binning" in techniques:
            plan.append(
                {
                    "priority": priority,
                    "action": "feature_engineering",
                    "details": "Prioritize scaling, polynomial interactions, and binning for numeric features.",
                }
            )
            priority += 1

        if "lightgbm" in techniques:
            plan.append(
                {
                    "priority": priority,
                    "action": "lightgbm_objectives",
                    "details": "Tune LightGBM objectives with quantile or huber settings for robustness.",
                }
            )
            priority += 1

        return {"plan": plan, "signals": {"techniques": techniques, "stacking": stacking}}

    async def analyze_data_characteristics(self, ctx: RunContext[ScientistDeps]) -> dict[str, Any]:
        """Analyze competition data characteristics.

        @notice: |
            Profiles datasets and derives preprocessing hints.

        @effects:
            io:
                - Kaggle API request
                - local filesystem access
        """
        with logfire.span("scientist.analyze_data"):
            try:
                with tempfile.TemporaryDirectory() as tmp_dir:
                    files = await ctx.deps.platform_adapter.download_data(ctx.deps.competition.id, tmp_dir)
                    summary = self._summarize_dataset(files)
                    profile: DatasetProfile | None = None
                    try:
                        train_path, test_path, sample_path = locate_data_files(files)
                        profile = build_dataset_profile(train_path, test_path, sample_path)
                        hints = generate_preprocessing_hints(profile, ctx.deps.competition.id)
                        summary["dataset_profile"] = profile.to_dict()
                        summary["preprocessing_hints"] = [hint.to_dict() for hint in hints]
                    except Exception as exc:
                        logfire.warning("dataset_profile_failed", error=str(exc))
                    try:
                        external_data = await get_external_data_policy(
                            ctx.deps.http_client,
                            ctx.deps.competition.id,
                            profile=profile,
                            cache=ctx.deps.research_cache,
                        )
                        summary["external_data_rules"] = external_data
                    except Exception as exc:
                        logfire.warning("external_data_rules_failed", error=str(exc))
                    return summary
            except Exception as exc:
                logfire.warning("data_analysis_failed", error=str(exc))
                return self._fallback_dataset_summary(ctx.deps.competition)

    async def check_external_data_rules(
        self,
        ctx: RunContext[ScientistDeps],
        profile: Annotated[dict[str, Any] | None, Doc("Dataset profile snapshot, if available.")] = None,
    ) -> dict[str, Any]:
        """Check competition rules for external data usage.

        @notice: |
            Fetches rules and infers external data policy.

        @effects:
            io:
                - Kaggle API request
        """
        dataset_profile = DatasetProfile.from_dict(profile) if isinstance(profile, dict) else None
        return await get_external_data_policy(
            ctx.deps.http_client, ctx.deps.competition.id, profile=dataset_profile, cache=ctx.deps.research_cache
        )

    async def compute_baseline_estimate(
        self,
        ctx: RunContext[ScientistDeps],
        leaderboard_scores: Annotated[list[float], Doc("Leaderboard scores to summarize.")],
        competition_difficulty: Annotated[str, Doc("Difficulty label (easy/medium/hard).")],
    ) -> float:
        """Estimate achievable baseline score.

        @notice: |
            Uses leaderboard median and a difficulty multiplier.

        @effects:
            state:
                - none
        """
        _ = ctx
        if not leaderboard_scores:
            return 0.0

        median = sorted(leaderboard_scores)[len(leaderboard_scores) // 2]
        difficulty_multiplier = {"easy": 0.95, "medium": 0.85, "hard": 0.70}.get(competition_difficulty, 0.80)

        return median * difficulty_multiplier

    def _create_agent(self) -> Agent[ScientistDeps, ResearchReport]:
        """Create the underlying pydantic-ai agent.

        @factory-for:
            id: agent_k.agents.scientist:ScientistAgent
            rationale: "Centralizes agent wiring and toolset preparation."
            singleton: true
            cache-key: "module"

        @canonical-home:
            for:
                - "scientist agent construction"
            notes: "Use ScientistAgent() or module singleton."
        """
        builtin_tools: list[Any] = [prepare_web_search, prepare_web_fetch]
        if self._memory_backend is not None:
            builtin_tools.append(prepare_memory_tool)

        agent: Agent[ScientistDeps, ResearchReport] = Agent(
            model=get_model(self._settings.model),
            deps_type=ScientistDeps,
            output_type=ResearchReport,
            instructions=SCIENTIST_SYSTEM_PROMPT,
            name="scientist",
            model_settings=self._settings.model_settings,
            retries=self._settings.tool_retries,
            output_retries=self._settings.output_retries,
            builtin_tools=builtin_tools,
            toolsets=[
                create_production_toolset([self._toolset, cast("FunctionToolset[ScientistDeps]", kaggle_toolset)])
            ],
            prepare_tools=universal_tool_preparation,
            instrument=True,
        )

        agent.output_validator(self._validate_research_completeness)
        agent.instructions(self._add_competition_context)

        return agent

    def _register_tools(self) -> None:
        """Register all research tools with the toolset."""
        self._toolset.tool(self.analyze_leaderboard)
        self._toolset.tool(self.get_kaggle_notebooks)
        self._toolset.tool(self.analyze_top_kernels)
        self._toolset.tool(self.extract_techniques)
        self._toolset.tool(self.synthesize_strategy)
        self._toolset.tool(self.analyze_data_characteristics)
        self._toolset.tool(self.check_external_data_rules)
        self._toolset.tool(self.compute_baseline_estimate)

    async def _validate_research_completeness(
        self, ctx: RunContext[ScientistDeps], output: ResearchReport
    ) -> ResearchReport:
        """Validate research report completeness."""
        if ctx.partial_output:
            return output
        if not output.recommended_approaches:
            raise ModelRetry("Research must include recommended approaches.")
        if not output.domain_findings and not output.technique_findings:
            raise ModelRetry("Research must include at least one finding.")
        return output

    async def _add_competition_context(self, ctx: RunContext[ScientistDeps]) -> str:
        """Add competition-specific context to instructions."""
        comp = ctx.deps.competition
        prize = f"${comp.prize_pool:,}" if comp.prize_pool else "N/A"
        tags = ", ".join(comp.tags) if comp.tags else "None"
        return (
            "CURRENT COMPETITION:\n"
            f"- ID: {comp.id}\n"
            f"- Title: {comp.title}\n"
            f"- Type: {comp.competition_type.value}\n"
            f"- Metric: {comp.metric.value} ({comp.metric_direction})\n"
            f"- Days Remaining: {comp.days_remaining}\n"
            f"- Prize Pool: {prize}\n"
            f"- Tags: {tags}"
        )

    async def _fetch_kaggle_notebooks(
        self, ctx: RunContext[ScientistDeps], *, sort_by: str, max_results: int
    ) -> list[dict[str, Any]]:
        params: dict[str, str | int] = {
            "competition": ctx.deps.competition.id,
            "sortBy": sort_by,
            "pageSize": max_results,
        }

        auth = self._extract_kaggle_auth(ctx.deps.platform_adapter)
        if not auth:
            return []

        response = await ctx.deps.http_client.get(_KAGGLE_KERNELS_ENDPOINT, params=params, auth=auth)
        if response.status_code != 200:
            return []

        results: list[dict[str, Any]] = []
        for item in response.json():
            ref = item.get("ref") or item.get("id") or item.get("kernelId")
            results.append(
                {
                    "title": item.get("title", ""),
                    "votes": item.get("voteCount", 0),
                    "author": item.get("author", ""),
                    "techniques": self._infer_techniques_from_text(
                        f"{item.get('title', '')} {item.get('scriptVersionTitle', '')}"
                    ),
                    "url": item.get("url", ""),
                    "ref": ref,
                }
            )
        return results

    async def _fetch_kernel_code(self, ctx: RunContext[ScientistDeps], kernel_ref: str) -> str | None:
        auth = self._extract_kaggle_auth(ctx.deps.platform_adapter)
        if not auth:
            return None
        for key in ("kernelId", "kernelSlug"):
            response = await ctx.deps.http_client.get(_KAGGLE_KERNEL_VIEW_ENDPOINT, params={key: kernel_ref}, auth=auth)
            if response.status_code != 200:
                continue
            payload = response.json()
            for code_key in _KAGGLE_KERNEL_CODE_KEYS:
                value = payload.get(code_key)
                if isinstance(value, str) and value.strip():
                    return value
        return None

    def _extract_kernel_ref(self, url: str) -> str | None:
        if not url:
            return None
        match = re.search(r"kaggle\.com/(?:code/)?(?P<owner>[^/]+)/(?P<slug>[^/?#]+)", url)
        if not match:
            return None
        return f"{match.group('owner')}/{match.group('slug')}"

    def _extract_techniques_from_code(self, kernel_code: str) -> dict[str, Any]:
        techniques: list[str] = []
        target_transforms: list[str] = []
        stacking: list[str] = []
        for name, pattern in _KERNEL_TECHNIQUE_PATTERNS.items():
            if not kernel_code or not pattern.search(kernel_code):
                continue
            if name in {"log1p", "boxcox", "yeojohnson"}:
                target_transforms.append(name)
            elif name in {"stacking", "blending"}:
                stacking.append(name)
            else:
                techniques.append(name)
        return {
            "techniques": sorted(set(techniques)),
            "target_transforms": sorted(set(target_transforms)),
            "stacking": sorted(set(stacking)),
        }

    def _summarize_kernel_analysis(self, analyses: list[dict[str, Any]]) -> dict[str, Any]:
        counts: dict[str, int] = {}
        targets: dict[str, int] = {}
        for entry in analyses:
            for technique in entry.get("techniques", []):
                counts[technique] = counts.get(technique, 0) + 1
            for transform in entry.get("target_transforms", []):
                targets[transform] = targets.get(transform, 0) + 1
        return {"technique_counts": counts, "target_transform_counts": targets, "total_kernels": len(analyses)}

    def _extract_kaggle_auth(self, adapter: PlatformAdapter) -> tuple[str, str] | None:
        if not hasattr(adapter, "config"):
            return None
        config = adapter.config
        username, api_key = getattr(config, "username", None), getattr(config, "api_key", None)
        return (username, api_key) if username and api_key else None

    def _infer_techniques_from_text(self, text: str) -> list[str]:
        lower_text = text.lower()
        techniques = []
        for keyword, technique in _DEFAULT_NOTEBOOK_TECHNIQUES.items():
            if keyword in lower_text and technique not in techniques:
                techniques.append(technique)
        return techniques

    def _summarize_dataset(self, files: list[str]) -> dict[str, Any]:
        summary: dict[str, Any] = {"files": [], "total_size_mb": 0.0}

        for file_path in files:
            path = Path(file_path)
            if not path.exists():
                continue

            file_info: dict[str, Any] = {"name": path.name, "size_mb": round(path.stat().st_size / (1024 * 1024), 2)}
            summary["total_size_mb"] += file_info["size_mb"]
            if path.suffix.lower() == ".csv":
                file_info.update(self._summarize_csv(path))

            summary["files"].append(file_info)

        summary["total_size_mb"] = round(summary["total_size_mb"], 2)
        return summary

    def _summarize_csv(self, path: Path) -> dict[str, Any]:
        with path.open("r", encoding="utf-8", newline="") as handle:
            reader = csv.reader(handle)
            rows = list(reader)

        if not rows:
            return {"row_count": 0, "column_count": 0}

        header = rows[0]
        sample_rows = rows[1:101]
        missing_counts = {col: 0 for col in header}
        for row in sample_rows:
            for col, value in zip(header, row, strict=False):
                if value.strip().lower() in _MISSING_VALUE_TOKENS:
                    missing_counts[col] += 1

        return {
            "row_count": len(rows) - 1,
            "column_count": len(header),
            "columns": header,
            "missing_values": {col: count for col, count in missing_counts.items() if count > 0},
        }

    def _fallback_dataset_summary(self, competition: Competition) -> dict[str, Any]:
        return {"files": [], "total_size_mb": 0.0, "notes": f"Dataset summary unavailable for {competition.title}"}
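
The CSV summary logic in `_summarize_csv` can be tried in isolation. The following is a minimal sketch, not the module's code: it reads from a string instead of a `Path`, and `MISSING_TOKENS` is a hypothetical stand-in for `_MISSING_VALUE_TOKENS`, whose contents are not shown on this page.

```python
import csv
import io
from typing import Any

# Hypothetical token set; the module's real _MISSING_VALUE_TOKENS is not shown here.
MISSING_TOKENS = {"", "na", "nan", "null", "none"}


def summarize_csv_text(text: str, sample_size: int = 100) -> dict[str, Any]:
    """Count rows and columns, and tally missing values in the first data rows."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return {"row_count": 0, "column_count": 0}

    header, body = rows[0], rows[1:]
    missing = {col: 0 for col in header}
    for row in body[:sample_size]:
        # zip stops at the shorter sequence, so ragged rows are tolerated.
        for col, value in zip(header, row):
            if value.strip().lower() in MISSING_TOKENS:
                missing[col] += 1

    return {
        "row_count": len(body),
        "column_count": len(header),
        "columns": header,
        "missing_values": {col: n for col, n in missing.items() if n > 0},
    }


sample = "id,age,city\n1,34,Oslo\n2,NA,\n3,,Lima\n"
result = summarize_csv_text(sample)
```

As in the method above, `row_count` covers the whole file while `missing_values` only reflects the first 100 data rows, so the counts are a sample rather than a full scan.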

__init__

__init__(
    settings: Annotated[
        ScientistSettings | None,
        Doc("Optional settings override."),
    ] = None,
) -> None

Initialize the Scientist agent.

@notice: | Builds the agent singleton and registers tools.

@dev: | Initializes memory backend, toolset, and pydantic-ai Agent.

@state-changes: - self._settings - self._toolset - self._memory_backend - self._agent

Source code in agent_k/agents/scientist.py

def __init__(
    self, settings: Annotated[ScientistSettings | None, Doc("Optional settings override.")] = None
) -> None:
    """Initialize the Scientist agent.

    @notice: |
        Builds the agent singleton and registers tools.

    @dev: |
        Initializes memory backend, toolset, and pydantic-ai Agent.

    @state-changes:
        - self._settings
        - self._toolset
        - self._memory_backend
        - self._agent
    """
    self._settings = settings or ScientistSettings()
    self._toolset: FunctionToolset[ScientistDeps] = FunctionToolset(id="scientist")
    self._memory_backend = self._init_memory_backend()
    self._register_tools()
    self._agent = self._create_agent()
    register_agent("scientist", self._agent)
    self._setup_memory()

agent `property`

agent: Agent[ScientistDeps, ResearchReport]

Return the underlying pydantic-ai Agent.

settings `property`

settings: ScientistSettings

Return current settings.

analyze_leaderboard `async`

analyze_leaderboard(
    ctx: RunContext[ScientistDeps],
    refresh: Annotated[
        bool,
        Doc(
            "Whether to refresh leaderboard before analysis."
        ),
    ] = True,
) -> dict[str, Any]

Analyze the current competition leaderboard.

@notice: | Computes score distribution stats and lists the top teams from leaderboard entries.

@effects: io: - Kaggle API request (optional) state: - ctx.deps.leaderboard

Source code in agent_k/agents/scientist.py

async def analyze_leaderboard(
    self,
    ctx: RunContext[ScientistDeps],
    refresh: Annotated[bool, Doc("Whether to refresh leaderboard before analysis.")] = True,
) -> dict[str, Any]:
    """Analyze the current competition leaderboard.

    @notice: |
        Computes score distribution stats and lists the top teams from leaderboard entries.

    @effects:
        io:
            - Kaggle API request (optional)
        state:
            - ctx.deps.leaderboard
    """
    with logfire.span("scientist.analyze_leaderboard"):
        if refresh:
            await ctx.deps.refresh_leaderboard()

        leaderboard = ctx.deps.leaderboard
        if not leaderboard:
            return {"error": "No leaderboard data available"}

        scores = [e.score for e in leaderboard]
        return {
            "total_teams": len(leaderboard),
            "top_score": max(scores),
            "median_score": sorted(scores)[len(scores) // 2],
            "score_range": max(scores) - min(scores),
            "top_10_scores": [e.score for e in leaderboard[:10]],
            "top_teams": [{"rank": e.rank, "team": e.team_name, "score": e.score} for e in leaderboard[:10]],
        }
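
The statistics returned above can be reproduced on toy data. This is a minimal sketch in which `Entry` is a hypothetical stand-in for the leaderboard entry type; only the `rank`, `team_name`, and `score` attributes used by the method are assumed.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a leaderboard entry."""

    rank: int
    team_name: str
    score: float


def leaderboard_stats(entries: list[Entry]) -> dict[str, float]:
    scores = [e.score for e in entries]
    return {
        "total_teams": len(entries),
        "top_score": max(scores),
        # Same indexing as the method above: the upper middle value
        # when the number of scores is even.
        "median_score": sorted(scores)[len(scores) // 2],
        "score_range": max(scores) - min(scores),
    }


entries = [Entry(1, "a", 1.0), Entry(2, "b", 0.75), Entry(3, "c", 0.5), Entry(4, "d", 0.25)]
stats = leaderboard_stats(entries)
conventional_median = statistics.median(e.score for e in entries)
```

With four scores, `median_score` is 0.75, whereas `statistics.median` averages the two middle values and gives 0.625. Note also that `top_score` is taken with `max`, which presumes a higher-is-better metric; for an error metric the best score would be the minimum.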

get_kaggle_notebooks `async`

get_kaggle_notebooks(
    ctx: RunContext[ScientistDeps],
    sort_by: Annotated[
        str, Doc("Sort key for Kaggle notebooks.")
    ] = "voteCount",
    max_results: Annotated[
        int,
        Doc("Maximum notebooks to retrieve."),
        Range(1, 50),
    ] = 10,
) -> list[dict[str, Any]]

Get top notebooks for the competition.

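@notice: | Retrieves competition notebooks ordered by vote count or freshness. If none are returned, falls back to placeholder entries synthesized from the leaderboard.

@effects: io: - Kaggle API request state: - ctx.deps.leaderboard (fallback only)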

Source code in agent_k/agents/scientist.py

async def get_kaggle_notebooks(
    self,
    ctx: RunContext[ScientistDeps],
    sort_by: Annotated[str, Doc("Sort key for Kaggle notebooks.")] = "voteCount",
    max_results: Annotated[int, Doc("Maximum notebooks to retrieve."), Range(1, 50)] = 10,
) -> list[dict[str, Any]]:
    """Get top notebooks for the competition.

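    @notice: |
        Retrieves competition notebooks ordered by vote count or freshness.
        If none are returned, falls back to placeholder entries synthesized
        from the leaderboard.

    @effects:
        io:
            - Kaggle API request
        state:
            - ctx.deps.leaderboard (fallback only)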
    """
    with logfire.span("scientist.get_notebooks"):
        notebooks = await self._fetch_kaggle_notebooks(ctx, sort_by=sort_by, max_results=max_results)
        if notebooks:
            return notebooks

        await ctx.deps.refresh_leaderboard()
        return [
            {
                "title": f"{ctx.deps.competition.title} solution by {entry.team_name}",
                "votes": max(1, (len(ctx.deps.leaderboard) - entry.rank + 1) * 5),
                "author": entry.team_name,
                "techniques": self._infer_techniques_from_text(" ".join(ctx.deps.competition.tags)),
            }
            for entry in ctx.deps.leaderboard[:max_results]
        ]
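
When the notebook fetch returns nothing, the method builds placeholder entries from the leaderboard. The sketch below isolates the synthetic vote formula from the source; `fallback_votes` is a hypothetical helper name used only for illustration.

```python
def fallback_votes(total_teams: int, rank: int) -> int:
    """Synthetic vote count: higher-ranked teams get more votes, never below 1."""
    return max(1, (total_teams - rank + 1) * 5)


# Four teams on the leaderboard, ranks 1 through 4.
votes_by_rank = {rank: fallback_votes(total_teams=4, rank=rank) for rank in range(1, 5)}

# A rank larger than the leaderboard length is clamped to a single vote.
clamped = fallback_votes(total_teams=4, rank=10)
```

These entries are placeholders rather than real notebooks: they carry no `ref` or `url` key, and because `techniques` is inferred from the competition tags, every fallback entry shares the same technique list.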

analyze_top_kernels `async`

analyze_top_kernels(
    ctx: RunContext[ScientistDeps],
    sort_by: Annotated[
        str, Doc("Sort key for Kaggle kernels.")
    ] = "voteCount",
    max_results: Annotated[
        int,
        Doc("Maximum kernels to analyze."),
        Range(1, 25),
    ] = 5,
) -> dict[str, Any]

Analyze top Kaggle kernels for techniques and patterns.

@notice: | Downloads and summarizes kernel code to identify techniques.

@effects: io: - Kaggle API request

Source code in agent_k/agents/scientist.py

async def analyze_top_kernels(
    self,
    ctx: RunContext[ScientistDeps],
    sort_by: Annotated[str, Doc("Sort key for Kaggle kernels.")] = "voteCount",
    max_results: Annotated[int, Doc("Maximum kernels to analyze."), Range(1, 25)] = 5,
) -> dict[str, Any]:
    """Analyze top Kaggle kernels for techniques and patterns.

    @notice: |
        Downloads and summarizes kernel code to identify techniques.

    @effects:
        io:
            - Kaggle API request
    """
    with logfire.span("scientist.analyze_top_kernels"):
        notebooks = await self._fetch_kaggle_notebooks(ctx, sort_by=sort_by, max_results=max_results)
        analyses: list[dict[str, Any]] = []
        for notebook in notebooks[:max_results]:
            kernel_ref = notebook.get("ref") or self._extract_kernel_ref(notebook.get("url", ""))
            code = await self._fetch_kernel_code(ctx, kernel_ref) if kernel_ref else None
            extracted = (
                self._extract_techniques_from_code(code)
                if code
                else {"techniques": notebook.get("techniques", []), "target_transforms": [], "stacking": []}
            )
            analyses.append(
                {
                    "title": notebook.get("title"),
                    "author": notebook.get("author"),
                    "votes": notebook.get("votes"),
                    "url": notebook.get("url"),
                    "ref": kernel_ref,
                    "techniques": extracted["techniques"],
                    "target_transforms": extracted["target_transforms"],
                    "stacking": extracted["stacking"],
                    "code_available": bool(code),
                }
            )
        return {"kernels": analyses, "summary": self._summarize_kernel_analysis(analyses)}
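
The `summary` value is an aggregate over the per-kernel analyses. A minimal sketch of that aggregation follows, assuming the same three output keys as `_summarize_kernel_analysis`; it uses `collections.Counter` instead of the manual dictionary counting in the source, and `summarize_analyses` is a hypothetical name.

```python
from collections import Counter
from typing import Any


def summarize_analyses(analyses: list[dict[str, Any]]) -> dict[str, Any]:
    """Count how often each technique and target transform appears."""
    technique_counts: Counter[str] = Counter()
    transform_counts: Counter[str] = Counter()
    for entry in analyses:
        technique_counts.update(entry.get("techniques", []))
        transform_counts.update(entry.get("target_transforms", []))
    return {
        "technique_counts": dict(technique_counts),
        "target_transform_counts": dict(transform_counts),
        "total_kernels": len(analyses),
    }


analyses = [
    {"techniques": ["lightgbm", "stacking"], "target_transforms": ["log1p"]},
    {"techniques": ["lightgbm"], "target_transforms": []},
    {"techniques": ["xgboost"]},  # a missing key is treated as an empty list
]
summary = summarize_analyses(analyses)
```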

extract_techniques `async`

extract_techniques(
    ctx: RunContext[ScientistDeps],
    kernel_code: Annotated[
        str, Doc("Raw kernel code to analyze.")
    ],
) -> dict[str, Any]

Extract modeling techniques from Kaggle kernel code.

@notice: | Scans code for common ML techniques and feature engineering patterns.

@effects: state: - none

Source code in agent_k/agents/scientist.py

async def extract_techniques(
    self, ctx: RunContext[ScientistDeps], kernel_code: Annotated[str, Doc("Raw kernel code to analyze.")]
) -> dict[str, Any]:
    """Extract modeling techniques from Kaggle kernel code.

    @notice: |
        Scans code for common ML techniques and feature engineering patterns.

    @effects:
        state:
            - none
    """
    _ = ctx
    return self._extract_techniques_from_code(kernel_code)
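
The body of `_extract_techniques_from_code` is not shown on this page. As a rough illustration of what a pattern scan with the same return shape (`techniques`, `target_transforms`, `stacking`) could look like, here is a sketch in which every pattern and helper name is a hypothetical example, not the module's actual rule set.

```python
import re

# Hypothetical patterns for illustration only; the module's real rules are not shown.
TECHNIQUE_PATTERNS = {
    "lightgbm": r"\blightgbm\b|\bLGBM\w*",
    "xgboost": r"\bxgboost\b|\bXGB\w*",
    "feature_scaling": r"\bStandardScaler\b|\bMinMaxScaler\b",
}
TRANSFORM_PATTERNS = {"log1p": r"\bnp\.log1p\b"}
STACKING_PATTERNS = {"stacking": r"\bStacking(Regressor|Classifier)\b"}


def _scan(code: str, patterns: dict[str, str]) -> list[str]:
    """Return the names whose pattern occurs anywhere in the code."""
    return [name for name, pattern in patterns.items() if re.search(pattern, code)]


def extract_techniques_sketch(code: str) -> dict[str, list[str]]:
    return {
        "techniques": _scan(code, TECHNIQUE_PATTERNS),
        "target_transforms": _scan(code, TRANSFORM_PATTERNS),
        "stacking": _scan(code, STACKING_PATTERNS),
    }


kernel = (
    "import lightgbm as lgb\n"
    "import numpy as np\n"
    "y = np.log1p(train['target'])\n"
    "scaler = StandardScaler()\n"
)
found = extract_techniques_sketch(kernel)
```

The resulting dictionary has the keys that `synthesize_strategy` accepts as keyword arguments.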

synthesize_strategy `async`

synthesize_strategy(
    ctx: RunContext[ScientistDeps],
    techniques: Annotated[
        list[str] | None,
        Doc("Extracted modeling techniques."),
    ] = None,
    target_transforms: Annotated[
        list[str] | None,
        Doc("Candidate target transforms."),
    ] = None,
    stacking: Annotated[
        list[str] | None, Doc("Stacking or blending hints.")
    ] = None,
) -> dict[str, Any]

Synthesize a prioritized strategy from extracted techniques.

@notice: | Aggregates extracted techniques into prioritized action items.

@effects: state: - none

Source code in agent_k/agents/scientist.py

async def synthesize_strategy(
    self,
    ctx: RunContext[ScientistDeps],
    techniques: Annotated[list[str] | None, Doc("Extracted modeling techniques.")] = None,
    target_transforms: Annotated[list[str] | None, Doc("Candidate target transforms.")] = None,
    stacking: Annotated[list[str] | None, Doc("Stacking or blending hints.")] = None,
) -> dict[str, Any]:
    """Synthesize a prioritized strategy from extracted techniques.

    @notice: |
        Aggregates extracted techniques into prioritized action items.

    @effects:
        state:
            - none
    """
    _ = ctx
    techniques = techniques or []
    target_transforms = target_transforms or []
    stacking = stacking or []
    plan: list[dict[str, Any]] = []
    priority = 1

    if target_transforms:
        plan.append(
            {
                "priority": priority,
                "action": "target_transform",
                "details": f"Evaluate {', '.join(sorted(set(target_transforms)))} for the target.",
            }
        )
        priority += 1

    if stacking or "stacking" in techniques or "blending" in techniques:
        plan.append(
            {
                "priority": priority,
                "action": "stacking_blending",
                "details": "Prototype a simple stacking/blending ensemble with diverse base models.",
            }
        )
        priority += 1

    if "feature_scaling" in techniques or "polynomial_features" in techniques or "binning" in techniques:
        plan.append(
            {
                "priority": priority,
                "action": "feature_engineering",
                "details": "Prioritize scaling, polynomial interactions, and binning for numeric features.",
            }
        )
        priority += 1

    if "lightgbm" in techniques:
        plan.append(
            {
                "priority": priority,
                "action": "lightgbm_objectives",
                "details": "Tune LightGBM objectives with quantile or huber settings for robustness.",
            }
        )
        priority += 1

    return {"plan": plan, "signals": {"techniques": techniques, "stacking": stacking}}
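
The prioritization above is a fixed-order checklist: each rule that fires takes the next consecutive priority. The sketch below condenses that logic into a table-driven form; `build_plan` is a hypothetical name, and the `details` strings are omitted for brevity.

```python
from typing import Any


def build_plan(
    techniques: list[str], target_transforms: list[str], stacking: list[str]
) -> list[dict[str, Any]]:
    # (action, condition) pairs in the same fixed order as the method above.
    candidates = [
        ("target_transform", bool(target_transforms)),
        ("stacking_blending", bool(stacking) or "stacking" in techniques or "blending" in techniques),
        (
            "feature_engineering",
            any(t in techniques for t in ("feature_scaling", "polynomial_features", "binning")),
        ),
        ("lightgbm_objectives", "lightgbm" in techniques),
    ]
    triggered = [action for action, condition in candidates if condition]
    # Priorities are consecutive, so skipped actions leave no gaps.
    return [{"priority": i, "action": action} for i, action in enumerate(triggered, start=1)]


plan = build_plan(techniques=["lightgbm", "binning"], target_transforms=["log1p"], stacking=[])
```

Here the stacking rule does not fire, so `feature_engineering` receives priority 2 rather than 3. In the method itself, the returned `signals` echo `techniques` and `stacking` but not `target_transforms`.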

analyze_data_characteristics `async`

analyze_data_characteristics(
    ctx: RunContext[ScientistDeps],
) -> dict[str, Any]

Analyze competition data characteristics.

@notice: | Profiles datasets and derives preprocessing hints.

@effects: io: - Kaggle API request - local filesystem access

Source code in agent_k/agents/scientist.py

async def analyze_data_characteristics(self, ctx: RunContext[ScientistDeps]) -> dict[str, Any]:
    """Analyze competition data characteristics.

    @notice: |
        Profiles datasets and derives preprocessing hints.

    @effects:
        io:
            - Kaggle API request
            - local filesystem access
    """
    with logfire.span("scientist.analyze_data"):
        try:
            with tempfile.TemporaryDirectory() as tmp_dir:
                files = await ctx.deps.platform_adapter.download_data(ctx.deps.competition.id, tmp_dir)
                summary = self._summarize_dataset(files)
                profile: DatasetProfile | None = None
                try:
                    train_path, test_path, sample_path = locate_data_files(files)
                    profile = build_dataset_profile(train_path, test_path, sample_path)
                    hints = generate_preprocessing_hints(profile, ctx.deps.competition.id)
                    summary["dataset_profile"] = profile.to_dict()
                    summary["preprocessing_hints"] = [hint.to_dict() for hint in hints]
                except Exception as exc:
                    logfire.warning("dataset_profile_failed", error=str(exc))
                try:
                    external_data = await get_external_data_policy(
                        ctx.deps.http_client,
                        ctx.deps.competition.id,
                        profile=profile,
                        cache=ctx.deps.research_cache,
                    )
                    summary["external_data_rules"] = external_data
                except Exception as exc:
                    logfire.warning("external_data_rules_failed", error=str(exc))
                return summary
        except Exception as exc:
            logfire.warning("data_analysis_failed", error=str(exc))
            return self._fallback_dataset_summary(ctx.deps.competition)
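
The method degrades in layers: a download failure yields the fallback summary, while a failure in profiling or in the policy lookup only drops the corresponding keys. If profiling fails, `profile` stays `None` and is still passed to `get_external_data_policy`. The following sketch shows the best-effort pattern on its own; `run_best_effort` and the stage callables are hypothetical, and the standard `logging` module stands in for `logfire`.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("scientist.sketch")


def run_best_effort(summary: dict[str, Any], stages: dict[str, Callable[[], Any]]) -> dict[str, Any]:
    """Run optional enrichment stages; a failing stage is logged and skipped."""
    for key, stage in stages.items():
        try:
            summary[key] = stage()
        except Exception as exc:
            logger.warning("%s_failed: %s", key, exc)
    return summary


def failing_profile() -> dict[str, Any]:
    raise ValueError("train.csv not found")


enriched = run_best_effort(
    {"files": [], "total_size_mb": 0.0},
    {
        "dataset_profile": failing_profile,
        "external_data_rules": lambda: {"external_data_allowed": False},
    },
)
```

Callers should therefore treat `dataset_profile`, `preprocessing_hints`, and `external_data_rules` as optional keys in the result.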

check_external_data_rules `async`

check_external_data_rules(
    ctx: RunContext[ScientistDeps],
    profile: Annotated[
        dict[str, Any] | None,
        Doc("Dataset profile snapshot, if available."),
    ] = None,
) -> dict[str, Any]

Check competition rules for external data usage.

@notice: | Fetches rules and infers external data policy.

@effects: io: - Kaggle API request

Source code in agent_k/agents/scientist.py

async def check_external_data_rules(
    self,
    ctx: RunContext[ScientistDeps],
    profile: Annotated[dict[str, Any] | None, Doc("Dataset profile snapshot, if available.")] = None,
) -> dict[str, Any]:
    """Check competition rules for external data usage.

    @notice: |
        Fetches rules and infers external data policy.

    @effects:
        io:
            - Kaggle API request
    """
    dataset_profile = DatasetProfile.from_dict(profile) if isinstance(profile, dict) else None
    return await get_external_data_policy(
        ctx.deps.http_client, ctx.deps.competition.id, profile=dataset_profile, cache=ctx.deps.research_cache
    )

compute_baseline_estimate `async`

compute_baseline_estimate(
    ctx: RunContext[ScientistDeps],
    leaderboard_scores: Annotated[
        list[float], Doc("Leaderboard scores to summarize.")
    ],
    competition_difficulty: Annotated[
        str, Doc("Difficulty label (easy/medium/hard).")
    ],
) -> float

Estimate achievable baseline score.

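@notice: | Multiplies the leaderboard median by a difficulty multiplier (easy 0.95, medium 0.85, hard 0.70, any other label 0.80). Returns 0.0 when no scores are given.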

@effects: state: - none

Source code in agent_k/agents/scientist.py

async def compute_baseline_estimate(
    self,
    ctx: RunContext[ScientistDeps],
    leaderboard_scores: Annotated[list[float], Doc("Leaderboard scores to summarize.")],
    competition_difficulty: Annotated[str, Doc("Difficulty label (easy/medium/hard).")],
) -> float:
    """Estimate achievable baseline score.

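    @notice: |
        Multiplies the leaderboard median by a difficulty multiplier
        (easy 0.95, medium 0.85, hard 0.70, any other label 0.80).
        Returns 0.0 when no scores are given.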

    @effects:
        state:
            - none
    """
    _ = ctx
    if not leaderboard_scores:
        return 0.0

    median = sorted(leaderboard_scores)[len(leaderboard_scores) // 2]
    difficulty_multiplier = {"easy": 0.95, "medium": 0.85, "hard": 0.70}.get(competition_difficulty, 0.80)

    return median * difficulty_multiplier
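
A short worked example of the estimate, written as a standalone function so it can run without a `RunContext`. The multipliers are copied from the source; the function and constant names are hypothetical.

```python
DIFFICULTY_MULTIPLIERS = {"easy": 0.95, "medium": 0.85, "hard": 0.70}
DEFAULT_MULTIPLIER = 0.80


def baseline_estimate(scores: list[float], difficulty: str) -> float:
    """Median score scaled by a difficulty multiplier; 0.0 for empty input."""
    if not scores:
        return 0.0
    median = sorted(scores)[len(scores) // 2]
    return median * DIFFICULTY_MULTIPLIERS.get(difficulty, DEFAULT_MULTIPLIER)


scores = [0.5, 1.0, 2.0]  # the median is 1.0

hard = baseline_estimate(scores, "hard")
easy = baseline_estimate(scores, "easy")
# The lookup is case-sensitive, so "Hard" falls through to the default.
unknown = baseline_estimate(scores, "Hard")
empty = baseline_estimate([], "easy")
```

Scaling the median down is a conservative estimate only when higher scores are better; for a lower-is-better metric the same multiplication would yield a more ambitious target than the median.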

get_external_data_policy `async`

get_external_data_policy(
    http_client: Annotated[
        AsyncClient, Doc("HTTP client for rule fetching.")
    ],
    competition_id: Annotated[
        str, Doc("Competition identifier (slug).")
    ],
    *,
    profile: Annotated[
        DatasetProfile | None,
        Doc(
            "Optional dataset profile for enrichment hints."
        ),
    ] = None,
    cache: Annotated[
        dict[str, Any] | None,
        Doc("Optional cache for policy results."),
    ] = None,
) -> dict[str, Any]

Fetch and parse competition rules for external data usage.

@notice: | Retrieves competition rules and infers external data allowance.

@dev: | Uses simple heuristics and caches results when provided.

@effects: io: - Kaggle API request state: - cache (optional)

Source code in agent_k/agents/scientist.py

async def get_external_data_policy(
    http_client: Annotated[httpx.AsyncClient, Doc("HTTP client for rule fetching.")],
    competition_id: Annotated[str, Doc("Competition identifier (slug).")],
    *,
    profile: Annotated[DatasetProfile | None, Doc("Optional dataset profile for enrichment hints.")] = None,
    cache: Annotated[dict[str, Any] | None, Doc("Optional cache for policy results.")] = None,
) -> dict[str, Any]:
    """Fetch and parse competition rules for external data usage.

    @notice: |
        Retrieves competition rules and infers external data allowance.

    @dev: |
        Uses simple heuristics and caches results when provided.

    @effects:
        io:
            - Kaggle API request
        state:
            - cache (optional)
    """
    cache_key = f"external_data_policy:{competition_id}"
    if cache is not None:
        cached = cache.get(cache_key)
        if isinstance(cached, dict):
            return dict(cached)

    rules_url = _KAGGLE_RULES_ENDPOINT.format(competition_id=competition_id)
    rules_text = ""
    try:
        response = await http_client.get(rules_url)
        if response.status_code == 200:
            rules_text = response.text
    except httpx.HTTPError as exc:
        logfire.warning("external_data_rules_fetch_failed", error=str(exc), competition_id=competition_id)

    allowed, restrictions = _parse_external_data_policy(rules_text)
    recommended = _suggest_enrichment_sources(profile) if allowed else []
    payload = {
        "external_data_allowed": allowed,
        "restrictions": restrictions,
        "recommended_sources": recommended,
        "rules_url": rules_url,
    }
    if cache is not None:
        cache[cache_key] = payload
    return payload
