# Architecture Diagrams

## Overall System Architecture

```mermaid
graph TB
    subgraph "Python API Layer"
        PY_KERAS["tf.keras<br/>(high-level API)"]
        PY_EAGER["tensorflow/python/eager<br/>(Eager Execution)"]
        PY_DATA["tf.data<br/>(data pipelines)"]
        PY_DISTRIBUTE["tf.distribute<br/>(distributed training strategies)"]
        PY_SAVED["tf.saved_model<br/>(model save and restore)"]
        PY_AUTOGRAPH["tf.autograph<br/>(Python -> Graph conversion)"]
        PY_PROFILER["tf.profiler<br/>(profiling)"]
    end

    subgraph "C API Layer"
        C_API["C API (c_api.h)"]
        C_EAGER["Eager C API"]
    end

    subgraph "C++ API Layer"
        CC_CLIENT["Client / Session"]
        CC_OPS["C++ Ops"]
        CC_GRAD["Gradients"]
    end

    subgraph "Core Runtime Layer"
        FRAMEWORK["Framework<br/>(Tensor, Op, Device)"]
        COMMON_RT["Common Runtime<br/>(Executor, Session, DeviceMgr)"]
        GRAPH["Graph<br/>(construction and transformation)"]
        KERNELS["Kernels<br/>(CPU/GPU op implementations)"]
        OPS_REG["Ops Registry<br/>(op definition and registration)"]
        GRAPPLER["Grappler<br/>(graph optimization)"]
        DIST_RT["Distributed Runtime<br/>(gRPC, Rendezvous)"]
    end

    subgraph "Compiler Layer"
        JIT["JIT<br/>(XLA clustering)"]
        TF2XLA["tf2xla<br/>(Op -> HLO conversion)"]
        MLIR["MLIR<br/>(transformation pipelines)"]
        TF2TRT["tf2tensorrt<br/>(TensorRT integration)"]
        AOT["AOT<br/>(ahead-of-time compilation)"]
    end

    subgraph "Platform Layer"
        PLATFORM["Platform Abstraction<br/>(OS, file system)"]
        LIB["Core Lib<br/>(strings, I/O, hashing)"]
        PROTOBUF["Protocol Buffers<br/>(serialization)"]
    end

    subgraph "TensorFlow Lite Layer"
        LITE_CORE["Lite Runtime Core"]
        LITE_KERNELS["Lite Kernels"]
        LITE_DELEGATES["Delegates<br/>(GPU, NNAPI, CoreML)"]
    end

    subgraph "External Integrations"
        GPU["GPU<br/>(CUDA / ROCm)"]
        TPU["TPU Runtime"]
        TENSORBOARD["TensorBoard"]
        STORAGE["Storage<br/>(local, GCS, S3)"]
        MOBILE["Mobile / Edge<br/>(Android, iOS)"]
    end

    PY_KERAS --> PY_EAGER
    PY_KERAS --> PY_DATA
    PY_KERAS --> PY_DISTRIBUTE
    PY_EAGER --> C_EAGER
    PY_AUTOGRAPH --> GRAPH
    PY_SAVED --> C_API
    PY_DATA --> KERNELS
    PY_DISTRIBUTE --> DIST_RT
    PY_PROFILER --> TENSORBOARD

    C_API --> FRAMEWORK
    C_API --> COMMON_RT
    C_EAGER --> COMMON_RT

    CC_CLIENT --> C_API
    CC_OPS --> OPS_REG
    CC_GRAD --> KERNELS

    COMMON_RT --> FRAMEWORK
    COMMON_RT --> GRAPH
    COMMON_RT --> KERNELS
    COMMON_RT --> GRAPPLER
    GRAPH --> OPS_REG
    KERNELS --> FRAMEWORK

    GRAPPLER --> GRAPH
    COMMON_RT --> JIT
    JIT --> TF2XLA
    JIT --> MLIR
    TF2XLA --> KERNELS
    MLIR --> GRAPH
    TF2TRT --> GRAPH

    DIST_RT --> COMMON_RT

    FRAMEWORK --> PLATFORM
    FRAMEWORK --> LIB
    FRAMEWORK --> PROTOBUF
    COMMON_RT --> PLATFORM

    LITE_CORE --> LITE_KERNELS
    LITE_CORE --> LITE_DELEGATES

    KERNELS --> GPU
    KERNELS --> TPU
    DIST_RT --> GPU
    LITE_DELEGATES --> MOBILE
    COMMON_RT --> STORAGE
    PY_SAVED --> STORAGE
```

## Layer Dependency Diagram

```mermaid
graph TD
    A["Python API Layer<br/>tensorflow/python/"] --> B["C API Layer<br/>tensorflow/c/"]
    A --> C["C++ API Layer<br/>tensorflow/cc/"]
    B --> D["Core Runtime Layer<br/>tensorflow/core/"]
    C --> B
    D --> E["Platform Layer<br/>tensorflow/core/platform/, tensorflow/core/lib/"]
    F["Compiler Layer<br/>tensorflow/compiler/"] --> D
    D --> F
    G["TF Lite Layer<br/>tensorflow/lite/"] -.->|via conversion tools| D

    style A fill:#4a90d9,color:#fff
    style B fill:#7cb342,color:#fff
    style C fill:#7cb342,color:#fff
    style D fill:#ff8f00,color:#fff
    style E fill:#8e24aa,color:#fff
    style F fill:#e53935,color:#fff
    style G fill:#00897b,color:#fff
```
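
The solid edges in the diagram above can be encoded as an adjacency map to check which layers depend on each other. The following is a minimal sketch in plain Python; the short layer keys (`python`, `c`, `cc`, `core`, `compiler`, `platform`) and the `reachable` helper are hypothetical names chosen for illustration, and the dotted TF Lite edge is left out.

```python
from typing import Dict, List, Set

# Solid edges copied from the diagram above (the dotted TF Lite edge is omitted).
# The short keys are illustrative labels, not real package names.
DEPENDS_ON: Dict[str, List[str]] = {
    "python": ["c", "cc"],
    "c": ["core"],
    "cc": ["c"],
    "core": ["platform", "compiler"],
    "compiler": ["core"],
    "platform": [],
}


def reachable(start: str) -> Set[str]:
    """Return every layer that `start` depends on, directly or transitively."""
    seen: Set[str] = set()
    stack = list(DEPENDS_ON[start])
    while stack:
        layer = stack.pop()
        if layer not in seen:
            seen.add(layer)
            stack.extend(DEPENDS_ON[layer])
    return seen


# A layer that can reach itself sits on a dependency cycle.
cyclic = sorted(layer for layer in DEPENDS_ON if layer in reachable(layer))
print(cyclic)
```

Run against these edges, the check reports the Core Runtime and Compiler layers as mutually dependent, which matches the two opposing arrows between them in the diagram.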

## Data Flow (Eager Execution)

```mermaid
sequenceDiagram
    participant User as User code
    participant PyAPI as Python API
    participant CAPI as C Eager API
    participant Runtime as Common Runtime
    participant DevMgr as Device Manager
    participant Kernel as Op Kernel
    participant Device as Device (CPU/GPU)

    User->>PyAPI: tf.matmul(a, b)
    PyAPI->>CAPI: EagerExecute("MatMul", inputs)
    CAPI->>Runtime: Op dispatch
    Runtime->>DevMgr: Select device
    DevMgr-->>Runtime: Selected device
    Runtime->>Kernel: Kernel lookup & execution
    Kernel->>Device: Run computation
    Device-->>Kernel: Result tensor
    Kernel-->>Runtime: Output tensor
    Runtime-->>CAPI: TFE_TensorHandle
    CAPI-->>PyAPI: EagerTensor
    PyAPI-->>User: tf.Tensor
```
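
The dispatch steps in the sequence above (device selection, kernel lookup, kernel execution) can be modeled with a small registry. This is a toy sketch in plain Python and does not use TensorFlow's actual runtime; `KERNELS`, `register_kernel`, `DeviceManager`, and `execute` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]

# Hypothetical kernel registry keyed by (op name, device type).
KERNELS: Dict[Tuple[str, str], Callable[..., Matrix]] = {}


def register_kernel(op: str, device: str):
    """Decorator that records a kernel for an (op, device) pair."""
    def wrap(fn):
        KERNELS[(op, device)] = fn
        return fn
    return wrap


@register_kernel("MatMul", "CPU")
def matmul_cpu(a: Matrix, b: Matrix) -> Matrix:
    # Plain row-by-column product standing in for a real CPU kernel.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


@dataclass
class DeviceManager:
    available: List[str]

    def select(self, op: str) -> str:
        # Pick the first available device that has a kernel for this op.
        for device in self.available:
            if (op, device) in KERNELS:
                return device
        raise LookupError(f"no kernel registered for {op}")


def execute(op: str, inputs: list, devices: DeviceManager) -> Matrix:
    device = devices.select(op)      # device selection
    kernel = KERNELS[(op, device)]   # kernel lookup
    return kernel(*inputs)           # kernel execution


devices = DeviceManager(available=["GPU", "CPU"])
result = execute("MatMul", [[[1, 2], [3, 4]], [[5, 6], [7, 8]]], devices)
print(result)
```

Because no `("MatMul", "GPU")` kernel is registered in this sketch, the device manager falls back to `"CPU"` even though `"GPU"` is listed first.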

## Data Flow (Graph Execution / tf.function)

```mermaid
sequenceDiagram
    participant User as User code
    participant TFFunc as tf.function
    participant Tracer as Tracing Engine
    participant Grappler as Grappler optimization
    participant JIT as XLA JIT (optional)
    participant Executor as Executor
    participant Kernel as Op Kernel

    User->>TFFunc: Function call
    TFFunc->>Tracer: Trace Python code
    Tracer-->>TFFunc: FuncGraph (ConcreteFunction)
    TFFunc->>Grappler: Graph optimization
    Grappler-->>TFFunc: Optimized graph
    TFFunc->>JIT: XLA compilation (when enabled)
    JIT-->>TFFunc: Compiled kernel
    TFFunc->>Executor: Graph execution
    Executor->>Kernel: Run a kernel for each node
    Kernel-->>Executor: Output tensor
    Executor-->>TFFunc: Final result
    TFFunc-->>User: tf.Tensor
```
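
The trace-then-execute flow above can be approximated in a few lines of plain Python. The sketch below is a toy stand-in, not TensorFlow code: `Sym`, `OPS`, `function`, and `trace_count` are hypothetical names, the cache key is simply the tuple of argument types, and the Grappler and XLA steps are omitted.

```python
from typing import Callable, Dict, List, Tuple


class Sym:
    """Symbolic value that records operations into a graph while tracing."""

    def __init__(self, graph: List[tuple], name: str):
        self.graph, self.name = graph, name

    def _emit(self, op: str, other) -> "Sym":
        out = Sym(self.graph, f"t{len(self.graph)}")
        rhs = other.name if isinstance(other, Sym) else other
        self.graph.append((out.name, op, self.name, rhs))
        return out

    def __add__(self, other):
        return self._emit("add", other)

    def __mul__(self, other):
        return self._emit("mul", other)


# Toy "kernels" used when the recorded graph is replayed.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def function(py_fn: Callable):
    """Toy stand-in for tf.function: trace once per input signature, then replay."""
    cache: Dict[Tuple[type, ...], Tuple[List[tuple], str]] = {}

    def call(*args):
        key = tuple(type(a) for a in args)
        if key not in cache:                      # trace on a cache miss
            graph: List[tuple] = []
            placeholders = [Sym(graph, f"arg{i}") for i in range(len(args))]
            result = py_fn(*placeholders)
            cache[key] = (graph, result.name)
            call.trace_count += 1
        graph, output = cache[key]
        env = {f"arg{i}": a for i, a in enumerate(args)}
        for out, op, lhs, rhs in graph:           # execute node by node
            # Strings are node names; anything else is a numeric constant.
            rhs_val = env[rhs] if isinstance(rhs, str) else rhs
            env[out] = OPS[op](env[lhs], rhs_val)
        return env[output]

    call.trace_count = 0
    return call


@function
def affine(x, y):
    return x * 2 + y


print(affine(3, 4), affine(5, 1), affine.trace_count)
```

Both calls use integer arguments, so the Python body is traced only once and the second call replays the recorded graph. Calling `affine` with floats would trigger a second trace under this sketch's type-based cache key.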

## Distributed Training Topology

```mermaid
graph TB
    subgraph "Worker 0 (Chief)"
        W0_STRATEGY["tf.distribute.Strategy"]
        W0_RUNTIME["Common Runtime"]
        W0_DEVICE["GPU 0, GPU 1"]
    end

    subgraph "Worker 1"
        W1_RUNTIME["Common Runtime"]
        W1_DEVICE["GPU 0, GPU 1"]
    end

    subgraph "Worker 2"
        W2_RUNTIME["Common Runtime"]
        W2_DEVICE["GPU 0, GPU 1"]
    end

    subgraph "Communication Layer"
        GRPC["gRPC"]
        NCCL["NCCL AllReduce"]
        RENDEZVOUS["Rendezvous"]
    end

    W0_STRATEGY --> W0_RUNTIME
    W0_RUNTIME --> W0_DEVICE
    W1_RUNTIME --> W1_DEVICE
    W2_RUNTIME --> W2_DEVICE

    W0_RUNTIME <--> GRPC
    W1_RUNTIME <--> GRPC
    W2_RUNTIME <--> GRPC

    W0_DEVICE <--> NCCL
    W1_DEVICE <--> NCCL
    W2_DEVICE <--> NCCL

    GRPC <--> RENDEZVOUS
```
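
The all-reduce step that connects the worker devices in the diagram can be illustrated by its result alone: every worker ends up holding the element-wise sum of all workers' values. The sketch below is a naive reduce-then-copy version in plain Python and does not reflect how NCCL implements the operation; `all_reduce_sum` and the sample gradients are hypothetical.

```python
from typing import List


def all_reduce_sum(per_worker: List[List[float]]) -> List[List[float]]:
    """Naive all-reduce: sum element-wise, then give every worker a copy."""
    total = [sum(values) for values in zip(*per_worker)]
    return [list(total) for _ in per_worker]


# Hypothetical gradients from three workers.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reduced = all_reduce_sum(grads)

# Dividing by the worker count turns the summed gradients into an average.
averaged = [g / len(grads) for g in reduced[0]]
print(reduced, averaged)
```

After the call, each of the three workers holds the same summed vector, which is the property synchronous data-parallel training relies on before applying an update.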
