Entities
This section introduces the different entities that can be created and stored in the geoh5 file format.
Groups
Groups are effectively containers for other entities, such as Objects (Points, Curve, Surface, etc.) and other Groups. Groups are used to establish parent-child relationships and to store information about a collection of entities.
RootGroup
By default, the parent of any new Entity is the workspace RootGroup. It is the only entity in the Workspace without a parent. Users rarely have to interact with the Root group as it is mainly used to maintain the overall project hierarchy.
ContainerGroup
A ContainerGroup can easily be added to the workspace and can be assigned a name and description.
[1]:
from geoh5py.groups import ContainerGroup
from geoh5py.workspace import Workspace
# Create a blank project
workspace = Workspace("my_project.geoh5")
# Add a group
group = ContainerGroup.create(workspace, name='myGroup')
At creation, "myGroup" is written to the project geoh5 file and visible in the Analyst project tree.
Any entity can be accessed by its name or uid (unique identifier):
[2]:
print(group.uid)
print(workspace.get_entity("myGroup")[0] == workspace.get_entity(group.uid)[0])
7e3f2f5a-c916-496e-be38-243cff3721cd
True
Objects
The geoh5 format enables storing a wide variety of Object entities that can be displayed in 3D. This section describes the collection of Object entities currently supported by geoh5py.
Points
The Points object consists of a list of vertices that define the location of actual data in 3D space. As for all other Objects, it can be created from an array of 3D coordinates and added to any group as follows:
[3]:
from geoh5py.workspace import Workspace
from geoh5py.objects import Points
import numpy as np
# Create a blank project
workspace = Workspace("my_project.geoh5")
# Generate a numpy array of xyz locations
n = 100
radius, theta = np.arange(n), np.linspace(0, np.pi*8, n)
x, y = radius * np.cos(theta), radius * np.sin(theta)
z = (x**2. + y**2.)**0.5
xyz = np.c_[x.ravel(), y.ravel(), z.ravel()] # Form a 2D array
# Create the Points object
points = Points.create(
    workspace,  # The target Workspace
    vertices=xyz,  # Set vertices
)
Curve
The Curve object, also known as a polyline, is often used to define contours, survey lines or geological contacts. It is a sub-class of the Points object with the added cells property, which defines the line segments connecting its vertices. By default, all vertices are connected sequentially following the order of the input vertices.
[4]:
from geoh5py.objects import Curve
# Create the Curve object
curve = Curve.create(
    workspace,  # The target Workspace
    vertices=xyz,
)
Alternatively, the cells property can be modified, either directly or by assigning a parts identifier to each vertex:
[5]:
# Split the curve into two parts
part_id = np.ones(n, dtype="int32")
part_id[:75] = 2
# Assign the part
curve.parts = part_id
workspace.finalize()
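The relationship between vertices, parts and cells can be sketched in plain Python. This is only an illustration of the idea, not geoh5py's implementation; the helper name `sequential_cells` is hypothetical, and it assumes that two consecutive vertices are joined only when they share the same part identifier.

```python
def sequential_cells(part_ids):
    """Connect consecutive vertices, skipping pairs that span two parts.

    Hypothetical helper: illustrates the concept only.
    """
    cells = []
    for i in range(len(part_ids) - 1):
        if part_ids[i] == part_ids[i + 1]:
            cells.append((i, i + 1))
    return cells


# Same split as in the cell above: 75 vertices in part 2, 25 in part 1
part_ids = [2] * 75 + [1] * 25
cells = sequential_cells(part_ids)

print(len(cells))            # 98: one segment fewer than a single 99-segment line
print(cells[73], cells[74])  # (73, 74) (75, 76): no segment joins vertices 74 and 75
```

Under this assumption, splitting a curve into parts removes only the segments that would bridge two different parts.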
Drillhole
Drillhole objects are different from other objects as their 3D geometry is defined by the collar and surveys attributes. The vertices and cells properties are only instantiated when interval or point log data are added.
[6]:
from geoh5py.objects import Drillhole
# Create a simple well
total_depth = 100
dist = np.linspace(0, total_depth, 10)
azm = np.ones_like(dist) * 45.
dip = np.linspace(-89, -75, dist.shape[0])
collar = np.r_[0., 10., 10]
well = Drillhole.create(
    workspace, collar=collar, surveys=np.c_[dist, dip, azm]
)
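To see how a collar and a survey table can define a 3D path, the following plain-Python sketch applies the simple tangent method. It is not geoh5py's desurvey routine, which may use a different scheme (for example minimum curvature). The function name `desurvey_tangent` and the angle conventions stated in its docstring are assumptions made for this illustration.

```python
import math


def desurvey_tangent(collar, surveys):
    """Approximate 3D positions along a hole with the simple tangent method.

    Assumptions (not taken from geoh5py): each survey row is
    (depth, dip, azimuth) in metres and degrees, azimuth is measured
    clockwise from north (+y), and a negative dip points downward.
    """
    x, y, z = collar
    path = [(x, y, z)]
    for (d0, dip, azm), (d1, _, _) in zip(surveys[:-1], surveys[1:]):
        length = d1 - d0
        dip_r, azm_r = math.radians(dip), math.radians(azm)
        x += length * math.cos(dip_r) * math.sin(azm_r)
        y += length * math.cos(dip_r) * math.cos(azm_r)
        z += length * math.sin(dip_r)
        path.append((x, y, z))
    return path


# A vertical hole: 100 m straight down from the collar
path = desurvey_tangent((0.0, 10.0, 10.0), [(0.0, -90.0, 0.0), (100.0, -90.0, 0.0)])
print(path[-1])  # x and y stay at the collar (up to rounding), z drops by 100 m
```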
Surface
The Surface object is also described by vertices and cells that form a net of triangles. If omitted on creation, the cells property is calculated using a 2D scipy.spatial.Delaunay triangulation.
[7]:
from geoh5py.objects import Surface
from scipy.spatial import Delaunay
# Create a triangulated surface from points
surf_2D = Delaunay(xyz[:, :2])
# Create the Surface object
surface = Surface.create(
    workspace,
    vertices=points.vertices,  # Add vertices
    cells=surf_2D.simplices,
)
Grid2D
The Grid2D object defines a regular grid of cells often used to display model sections or to compute data derivatives. A Grid2D can be oriented in 3D space using the origin, rotation and dip parameters.
[8]:
from geoh5py.objects import Grid2D
# Create the Grid2D object
grid = Grid2D.create(
    workspace,
    origin=[25, -75, 50],
    u_cell_size=2.5,
    v_cell_size=2.5,
    u_count=64,
    v_count=16,
    rotation=90.0,
    dip=45.0,
)
BlockModel
The BlockModel object defines a rectilinear grid of cells, also known as a tensor mesh. The cell center positions are determined by cell_delimiters (offsets) along perpendicular axes (u, v, z) and relative to the origin. A BlockModel can be oriented horizontally by controlling the rotation parameter.
[9]:
from geoh5py.objects import BlockModel
# Create the BlockModel object
blockmodel = BlockModel.create(
    workspace,
    origin=[25, -100, 50],
    u_cell_delimiters=np.cumsum(np.ones(16) * 5),  # Offsets along u
    v_cell_delimiters=np.cumsum(np.ones(32) * 5),  # Offsets along v
    z_cell_delimiters=np.cumsum(np.ones(16) * -2.5),  # Offsets along z (down)
    rotation=30.0,
)
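As a rough illustration of how delimiters relate to cell centers along one axis, the sketch below takes the midpoint between consecutive delimiters and shifts it by the origin. It assumes the delimiters mark cell edges, so that n delimiters bound n - 1 cells, and it ignores rotation. The helper `cell_centers` is hypothetical and not part of geoh5py.

```python
def cell_centers(origin_coord, delimiters):
    """Midpoints between consecutive delimiters, shifted by the origin.

    Hypothetical helper for a single axis, ignoring rotation.
    """
    return [
        origin_coord + 0.5 * (lo + hi)
        for lo, hi in zip(delimiters[:-1], delimiters[1:])
    ]


# Offsets along u as in the example above: 5, 10, ..., 80
u_delimiters = [5.0 * (i + 1) for i in range(16)]
u_centers = cell_centers(25.0, u_delimiters)

print(len(u_centers))  # 15 cells bounded by 16 delimiters
print(u_centers[:3])   # [32.5, 37.5, 42.5]
```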
Octree
The Octree object is a type of 3D grid that uses a tree structure to define cells. Each cell can be subdivided into eight octants, allowing for more efficient local refinement of the mesh. The Octree object can also be oriented horizontally by controlling the rotation parameter.
[10]:
from geoh5py.objects import Octree
octree = Octree.create(
    workspace,
    origin=[25, -100, 50],
    u_count=16,  # Number of base cells along u (power of 2)
    v_count=32,
    w_count=16,
    u_cell_size=5.0,  # Base cell size (highest octree level)
    v_cell_size=5.0,
    w_cell_size=2.5,  # Base cell size along w
    rotation=30,
)
By default, the octree mesh will be refined at the lowest level possible along each axis.
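The subdivision of a cell into eight octants can be sketched as follows. The `(i, j, k, size)` tuple layout, with indices counted in units of the smallest cell, is a hypothetical representation chosen for this example and is not necessarily how geoh5py stores octree cells.

```python
def subdivide(cell):
    """Split a cubic cell (i, j, k, size) into its eight octants.

    Indices are in units of the smallest cell; hypothetical layout.
    """
    i, j, k, size = cell
    half = size // 2
    if half < 1:
        raise ValueError("cell is already at the finest level")
    return [
        (i + di * half, j + dj * half, k + dk * half, half)
        for dk in (0, 1)
        for dj in (0, 1)
        for di in (0, 1)
    ]


children = subdivide((0, 0, 0, 4))
print(len(children))  # 8
print(children[-1])   # (2, 2, 2, 2)
```

Because each refinement halves the cell size, a cell spanning 2^n base cells can be split n times, which is consistent with the cell counts being powers of 2.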
Data
The geoh5 format allows storing data (values) on different parts of an Object. The data_association can be one of:
OBJECT: Single element characterizing the parent object
VERTEX: Array of values associated with the parent object vertices
CELL: Array of values associated with the parent object cells
Note: The length and order of the array provided must be consistent with the corresponding element of association.
The data types supported by geoh5py are:
Arrays
Integer
Text
Color_map
Add data
Data can be added to an Object entity using the add_data method.
[11]:
# Create a straight Curve object
curve = Curve.create(
    workspace,  # The target Workspace
    name="FlightLine3",
    vertices=np.c_[np.linspace(0, 100, 100), np.zeros(100), np.zeros(100)],
)
# Add a single string comment
curve.add_data({
    "my_comment": {
        "association": "OBJECT",
        "values": "hello_world",
    }
})
# Add a vector of floats
curve.add_data({
    "my_cell_values": {
        "association": "CELL",
        "values": np.random.randn(curve.n_cells),
    }
})
# Add multiple data vectors in a single call
data = {}
for ii in range(8):
    data[f"Period:{ii}"] = {
        "association": "VERTEX",
        "values": (ii + 1) * np.cos(ii * curve.vertices[:, 0] * np.pi / curve.vertices[:, 0].max() / 4.0),
    }
data_list = curve.add_data(data)
print([obj.name for obj in data_list])
['Period:0', 'Period:1', 'Period:2', 'Period:3', 'Period:4', 'Period:5', 'Period:6', 'Period:7']
If the association argument is omitted, geoh5py will attempt to assign the data to the correct part based on the shape of the data values, either object.n_vertices or object.n_cells.
The newly created data is directly added to the project’s geoh5 file and available for visualization.
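The idea of inferring the association from the number of values can be sketched as below. The helper `infer_association` is hypothetical and does not reproduce geoh5py's internal logic. In particular, checking cells before vertices and falling back to OBJECT when nothing matches are assumptions of this sketch.

```python
def infer_association(values, n_vertices, n_cells):
    """Guess the association from the number of values.

    Hypothetical helper: the check order and the OBJECT fallback
    are assumptions, not geoh5py's documented behaviour.
    """
    n_values = 1 if isinstance(values, (str, int, float)) else len(values)
    if n_values == n_cells:
        return "CELL"
    if n_values == n_vertices:
        return "VERTEX"
    return "OBJECT"


# A curve with 100 vertices connected sequentially has 99 segments
print(infer_association([0.0] * 100, n_vertices=100, n_cells=99))    # VERTEX
print(infer_association([0.0] * 99, n_vertices=100, n_cells=99))     # CELL
print(infer_association("hello_world", n_vertices=100, n_cells=99))  # OBJECT
```

Passing the association explicitly, as in the cell above, avoids any ambiguity when an object happens to have as many cells as vertices.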
Get data
Just like any Entity, data can be retrieved from the Workspace using the get_entity method. For convenience, Objects also have get_data_list and get_data methods that focus only on their respective children Data.
[12]:
my_list = curve.get_data_list()
print(my_list, curve.get_data(my_list[0]))
['Period:0', 'Period:1', 'Period:2', 'Period:3', 'Period:4', 'Period:5', 'Period:6', 'Period:7', 'my_cell_values', 'my_comment'] [<geoh5py.data.float_data.FloatData object at 0x7fc4e9ce4b50>]
Well Data
In the case of Drillhole objects, data are added as either interval log or point log values.
Point Log Data
Log data are used to represent measurements recorded at discrete depths along the well path. A depth attribute is required on creation. If the Drillhole object already holds point log data, geoh5py will attempt to match collocated depths within tolerance. By default, depth markers within 1 centimeter are merged (collocation_distance=1e-2).
[13]:
depths_A = np.arange(0, 50.)  # First list of depths
# Second list, with its first few depths slightly offset from the first list
depths_B = np.arange(47.1, 100)
# Add both sets of log data; the second uses a 0.5 m tolerance
well.add_data({
    "my_log_values": {
        "depth": depths_A,
        "values": np.random.randn(depths_A.shape[0]),
    },
    "log_wt_tolerance": {
        "depth": depths_B,
        "values": np.random.randn(depths_B.shape[0]),
        "collocation_distance": 0.5,
    },
})
[13]:
[<geoh5py.data.float_data.FloatData at 0x7fc4e9c6f290>,
<geoh5py.data.float_data.FloatData at 0x7fc4f81f7ad0>]
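The effect of the collocation tolerance can be illustrated with a small plain-Python sketch. The function `merge_depths` is hypothetical and only mimics the matching rule described in this section: a new depth is merged into the nearest existing marker when the two lie within `collocation_distance`, and is added as a new marker otherwise. geoh5py's own implementation may differ.

```python
def merge_depths(existing, new, collocation_distance=1e-2):
    """Merge new depths into existing markers when they are close enough.

    Hypothetical helper illustrating the matching rule; returns the
    merged, sorted list of depth markers.
    """
    merged = list(existing)
    for depth in new:
        nearest = min(merged, key=lambda d: abs(d - depth)) if merged else None
        if nearest is None or abs(nearest - depth) > collocation_distance:
            merged.append(depth)
    return sorted(merged)


depths_a = [float(d) for d in range(50)]  # 0, 1, ..., 49
depths_b = [47.1 + d for d in range(53)]  # 47.1, 48.1, ..., 99.1

print(len(merge_depths(depths_a, depths_b, collocation_distance=0.5)))  # 100
print(len(merge_depths(depths_a, depths_b)))                            # 103
```

With the 0.5 m tolerance, the three overlapping depths (47.1, 48.1 and 49.1) fall onto existing markers. With the default 1 cm tolerance they are kept as separate markers.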
Interval Log Data
Interval log data are defined by constant values bounded by a start and an end depth. A from-to attribute is expected on creation. Users can also control matching intervals by supplying a tolerance argument in meters (default tolerance: 1e-3 meter).
[14]:
# Add some geology as interval data
well.add_data({
    "interval_values": {
        "values": [1, 2, 3],
        "from-to": np.vstack([
            [0.25, 25.5],
            [30.1, 55.5],
            [56.5, 80.2],
        ]),
        "value_map": {
            1: "Unit_A",
            2: "Unit_B",
            3: "Unit_C",
        },
        "type": "referenced",
    }
})
[14]:
<geoh5py.data.referenced_data.ReferencedData at 0x7fc4e9c6fd90>
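To make the roles of from-to, values and value_map concrete, the following plain-Python sketch looks up the geological unit at a given depth. The helper `unit_at_depth` is hypothetical and is not a geoh5py function; treating intervals as closed on both ends is an assumption of this example.

```python
def unit_at_depth(depth, from_to, values, value_map):
    """Return the mapped name of the interval containing `depth`, or None.

    Hypothetical helper; intervals are treated as closed on both ends.
    """
    for (start, end), value in zip(from_to, values):
        if start <= depth <= end:
            return value_map[value]
    return None


from_to = [(0.25, 25.5), (30.1, 55.5), (56.5, 80.2)]
values = [1, 2, 3]
value_map = {1: "Unit_A", 2: "Unit_B", 3: "Unit_C"}

print(unit_at_depth(40.0, from_to, values, value_map))  # Unit_B
print(unit_at_depth(28.0, from_to, values, value_map))  # None (gap between intervals)
```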
Property Groups
Data entities sharing the same parent Object and association can be linked within property_groups and made available through profiling. This can be used to group data that would normally be stored as a 2D array.
[15]:
# Group the VERTEX data created previously into a property group
curve.add_data_to_group([obj.name for obj in data_list], "my_trig_group")
[15]:
<geoh5py.groups.property_group.PropertyGroup at 0x7fc4e9c6f4d0>
[16]:
# Update the geoh5 and re-write the Root
workspace.finalize()