google.cloud.gcp_vertexai_index_endpoint_deployed_index module – Creates a GCP VertexAI.IndexEndpointDeployedIndex resource

Note

This module is part of the google.cloud collection (version 1.13.0).

You might already have this collection installed if you are using the ansible package. It is not included in ansible-core. To check whether it is installed, run ansible-galaxy collection list.

To install it, use: ansible-galaxy collection install google.cloud. You need further requirements to be able to use this module, see Requirements for details.

To use it in a playbook, specify: google.cloud.gcp_vertexai_index_endpoint_deployed_index.

Synopsis 

An index endpoint is the resource that indexes are deployed into. An index endpoint can have multiple deployed indexes.

Requirements 

The following requirements are needed on the host that executes this module.

python >= 3.8
requests >= 2.18.4
google-auth >= 2.25.1

Parameters 

Parameter	Comments
access_token string	The access token used to authenticate.
auth_kind string / required	The type of credential used. Choices: `"accesstoken"` `"application"` `"machineaccount"` `"serviceaccount"`
automatic_resources dictionary	A description of resources that the DeployedIndex uses, which are decided to a large degree by Vertex AI and allow only modest additional configuration.
  max_replica_count integer	The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If maxReplicaCount is not set, the default value is minReplicaCount. The max allowed replica count is 1000. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic is assumed, though Vertex AI may be unable to scale beyond a certain replica number.
  min_replica_count integer	The minimum number of replicas this DeployedModel will always be deployed on. If minReplicaCount is not set, the default value is 2 (we don’t provide SLA when minReplicaCount=1). If traffic against it increases, it may dynamically be deployed onto more replicas up to [maxReplicaCount](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/AutomaticResources#FIELDS.max_replica_count), and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
dedicated_resources dictionary	A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field minReplicaCount must be set to a value strictly greater than 0, or else validation will fail. We don’t provide SLA when minReplicaCount=1. If maxReplicaCount is not set, the default value is minReplicaCount. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
  machine_spec dictionary / required	The specification of a single machine used by the DeployedIndex.
    machine_type string	The type of the machine. See the [list of machine types supported for prediction](https://cloud.google.com/vertex-ai/docs/predictions/configure-compute#machine-types). See the [list of machine types supported for custom training](https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types). For [DeployedModel](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints#DeployedModel) this field is optional, and the default value is n1-standard-2. For [BatchPredictionJob](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.batchPredictionJobs#BatchPredictionJob) or as part of [WorkerPoolSpec](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/CustomJobSpec#WorkerPoolSpec) this field is required.
  max_replica_count integer	The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If maxReplicaCount is not set, the default value is minReplicaCount.
  min_replica_count integer / required	The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1.
deployed_index_auth_config dictionary	If set, authentication is enabled for the private endpoint.
  auth_provider dictionary	Defines the authentication provider that the DeployedIndex uses.
    allowed_issuers list / elements=string	A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com.
    audiences list / elements=string	The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
deployed_index_id string / required	The user specified ID of the DeployedIndex. The ID can be up to 128 characters long and must start with a letter and only contain letters, numbers, and underscores. The ID must be unique within the project it is created in.
deployment_group string	The deployment group can be no longer than 64 characters (eg: ‘test’, ‘prod’). If not set, we will use the ‘default’ deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except ‘default’) can only be used with the same reserved_ip_ranges which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. [See the official documentation here](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.indexEndpoints#DeployedIndex.FIELDS.deployment_group). Note: we only support up to 5 deployment groups (not including ‘default’). Default: `"default"`
display_name string	The display name of the Index. The name can be up to 128 characters long and can consist of any UTF-8 characters.
enable_access_logging boolean	If true, private endpoint’s access logs are sent to Cloud Logging. Choices: `false` ← (default) `true`
env_type string	Specifies which Ansible environment you’re running this module within. This should not be set unless you know what you’re doing. This only alters the User Agent string for any API requests.
index string / required	The name of the Index that this DeployedIndex deploys.
index_endpoint dictionary / required	Identifies the index endpoint. Must be in the format ‘projects/{{project}}/locations/{{region}}/indexEndpoints/{{indexEndpoint}}’. This field is a reference to an IndexEndpoint resource in GCP. It can be specified in two ways: First, you can place a dictionary with key ‘name’ matching your resource. Alternatively, you can add `register: name-of-resource` to an IndexEndpoint task and then set this field to `{{ name-of-resource }}`.
project string	The Google Cloud Platform project to use.
region string / required	The region of the index, e.g. us-central1.
reserved_ip_ranges list / elements=string	A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: [‘vertex-ai-ip-range’]. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
scopes list / elements=string	Array of scopes to be used.
service_account_contents jsonarg	The contents of a Service Account JSON file, either in a dictionary or as a JSON string that represents it.
service_account_email string	An optional service account email address if machineaccount is selected and the user does not wish to use the default email.
service_account_file path	The path of a Service Account JSON file if serviceaccount is selected as type.
state string	Whether the resource should exist in GCP. Choices: `"present"` ← (default) `"absent"`
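
The deployment_group and reserved_ip_ranges descriptions above can be illustrated with a minimal sketch. It assumes earlier tasks registered `_myidx` and `_myidxep`, as in the Examples section; the ID, display name, group name, and address name are placeholder values chosen for illustration, not values required by the module.

```yaml
- name: Deploy an index into a named deployment group (illustrative sketch)
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    # Placeholder ID: starts with a letter; letters, numbers, underscores only.
    deployed_index_id: my_deployed_index
    display_name: my-deployed-index
    region: us-central1
    index: "{{ _myidx.name }}"              # assumes a registered Index task
    index_endpoint: "{{ _myidxep.name }}"   # assumes a registered IndexEndpoint task
    # A non-default group, at most 64 characters long.
    deployment_group: prod
    # Names of reserved addresses in the peered VPC network (placeholder name).
    reserved_ip_ranges:
      - vertex-ai-ip-range
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"
```

Per the deployment_group description, once `prod` has been used with this list of ranges, later deployments into `prod` would need to pass the same list.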

Notes 

Note

API Reference: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.indexEndpoints#DeployedIndex
For authentication, you can set auth_kind using the GCP_AUTH_KIND env variable.
For authentication, you can set service_account_file using the GCP_SERVICE_ACCOUNT_FILE env variable.
For authentication, you can set service_account_contents using the GCP_SERVICE_ACCOUNT_CONTENTS env variable.
For authentication, you can set service_account_email using the GCP_SERVICE_ACCOUNT_EMAIL env variable.
For authentication, you can set access_token using the GCP_ACCESS_TOKEN env variable.
For authentication, you can set scopes using the GCP_SCOPES env variable.
Environment variable values will only be used if the playbook values are not set.
The service_account_email, service_account_file, service_account_contents and access_token options are mutually exclusive.
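
As a rough sketch of the environment-variable fallback described in these notes, the task below leaves out auth_kind and service_account_file and supplies them through the task-level `environment` keyword instead. The key file path is a placeholder, and the sketch assumes the variables only need to be visible to the process that runs the module; exporting them in the shell before running ansible-playbook may work equally well.

```yaml
- name: Create a deployed index with credentials taken from the environment (sketch)
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    deployed_index_id: my_deployed_index    # placeholder ID
    region: us-central1
    index: "{{ _myidx.name }}"              # assumes a registered Index task
    index_endpoint: "{{ _myidxep.name }}"   # assumes a registered IndexEndpoint task
    project: "{{ gcp_project }}"
  environment:
    # Used only because auth_kind and service_account_file are not set above.
    GCP_AUTH_KIND: serviceaccount
    GCP_SERVICE_ACCOUNT_FILE: /path/to/service-account.json   # placeholder path
```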

Examples 

- name: Create basic index endpoint deployed index
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    display_name: "{{ resource_name }}"
    deployed_index_id: "{{ resource_name | regex_replace('-', '_') }}"
    region: us-central1
    index: "{{ _myidx.name }}"
    index_endpoint: "{{ _myidxep.name }}"
    enable_access_logging: false
    deployed_index_auth_config:
      auth_provider:
        audiences:
          - 123-myapp
        allowed_issuers:
          - mysa@myproject.iam.gserviceaccount.com
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"

################################################################################

- name: Create index endpoint deployed index with dedicated resources
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    display_name: "{{ resource_name }}"
    deployed_index_id: "{{ resource_name | regex_replace('-', '_') }}"
    region: us-central1
    index: "{{ _myidx.name }}"
    index_endpoint: "{{ _myidxep.name }}"
    enable_access_logging: false
    deployed_index_auth_config:
      auth_provider:
        audiences:
          - 123-myapp
        allowed_issuers:
          - mysa@myproject.iam.gserviceaccount.com
    dedicated_resources:
      min_replica_count: 1
      max_replica_count: 3
      machine_spec:
        machine_type: e2-standard-2
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"

################################################################################

- name: Create index endpoint deployed index with automatic resources
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    display_name: "{{ resource_name }}"
    deployed_index_id: "{{ resource_name | regex_replace('-', '_') }}"
    region: us-central1
    index: "{{ _myidx.name }}"
    index_endpoint: "{{ _myidxep.name }}"
    enable_access_logging: false
    deployed_index_auth_config:
      auth_provider:
        audiences:
          - 123-myapp
        allowed_issuers:
          - mysa@myproject.iam.gserviceaccount.com
    automatic_resources:
      max_replica_count: 3
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"
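
################################################################################

The examples above all create a resource. The following is a minimal sketch of removal, assuming the same registered variables and credentials as the tasks above; it sets `state: absent` and keeps every parameter marked as required in the Parameters table.

```yaml
- name: Delete index endpoint deployed index (sketch)
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: absent
    deployed_index_id: "{{ resource_name | regex_replace('-', '_') }}"
    region: us-central1
    index: "{{ _myidx.name }}"
    index_endpoint: "{{ _myidxep.name }}"
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"
```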

Return Values 

Common return values are documented here; the following are the fields unique to this module:

Key	Description
changed boolean	Whether the resource was changed. Returned: always
createTime string	The timestamp of when the Index was created in RFC3339 UTC “Zulu” format, with nanosecond resolution and up to nine fractional digits. Returned: success
indexSyncTime string	The DeployedIndex may depend on various data on its original Index. Additionally, when certain changes to the original Index are being made (e.g. when what the Index contains is being changed), the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp’s value is at least the [Index.update_time](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.indexes#Index.FIELDS.update_time) of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must [list](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations.operations/list#google.longrunning.Operations.ListOperations) the operations that are running on the original Index. Only the successfully completed Operations with updateTime equal to or before this sync time are contained in this DeployedIndex. A timestamp in RFC3339 UTC “Zulu” format, with nanosecond resolution and up to nine fractional digits. Examples: “2014-10-02T15:01:23Z” and “2014-10-02T15:01:23.045123456Z”. Returned: success
name string	The name of the DeployedIndex resource. Returned: success
privateEndpoints dictionary	Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if [network](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.indexEndpoints#IndexEndpoint.FIELDS.network) is configured. Returned: success
  matchGrpcAddress string	The IP address used to send match gRPC requests. Returned: success
  pscAutomatedEndpoints list / elements=dictionary	Populated if Private Service Connect is enabled and PscAutomatedConfig is set. Returned: success
    matchAddress string	IP address created by the automated forwarding rule. Returned: success
    network string	Corresponding network in pscAutomationConfigs. Returned: success
    projectId string	Corresponding projectId in pscAutomationConfigs. Returned: success
  serviceAttachment string	The name of the service attachment resource. Populated if Private Service Connect is enabled. Returned: success
state string	The current state of the resource. Returned: always
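
A short sketch of how these return values might be consumed: the first task registers the module result under a hypothetical variable name, `_mydeployedidx`, and the second prints a few of the fields listed above. The `default` filters are there because most fields are documented as returned only on success.

```yaml
- name: Create index endpoint deployed index and keep the result (sketch)
  google.cloud.gcp_vertexai_index_endpoint_deployed_index:
    state: present
    deployed_index_id: "{{ resource_name | regex_replace('-', '_') }}"
    region: us-central1
    index: "{{ _myidx.name }}"
    index_endpoint: "{{ _myidxep.name }}"
    project: "{{ gcp_project }}"
    auth_kind: "{{ gcp_cred_kind }}"
    service_account_file: "{{ gcp_cred_file }}"
  register: _mydeployedidx   # hypothetical variable name

- name: Show selected return values
  ansible.builtin.debug:
    msg:
      - "changed: {{ _mydeployedidx.changed }}"
      - "name: {{ _mydeployedidx.name | default('n/a') }}"
      - "indexSyncTime: {{ _mydeployedidx.indexSyncTime | default('n/a') }}"
```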

Authors

Google Inc. (@googlecloudplatform)
