Converting dot-separated values to Go structs using Python



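This is a specific requirement for an application whose configuration can change (specifically WSO2 Identity Server, since I am writing a Kubernetes operator for it in Go), but that is not really relevant here. I want a solution that makes it easy to manage a large number of configuration mappings and generate Go structs from them. These configuration mappings are in a .csv file.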

Link to the .csv: my_configs.csv

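What I want is a Python script that generates the Go structs automatically, so that any change to the application's configuration can be picked up by simply running the script to create the corresponding Go structs. I mean the configuration of the application itself. For example, a TOML key name in the CSV may be changed, or new values may be added.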
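To make the goal concrete, here is a minimal sketch of how a single CSV row could become one Go struct field. The column names (yaml_key, go_var, go_type, toml_key) are the ones the scripts in this post read; the sample row values and the helper name render_field are made up for illustration, and using only the last dotted segment of each key is an assumption.

```python
import csv
import io

# Hypothetical two-row sample; the real my_configs.csv is much larger.
SAMPLE_CSV = """yaml_key,go_var,go_type,toml_key
server.hostname,Hostname,string,server.hostname
server.port,Port,int,server.port
"""


def render_field(row):
    """Render one CSV row as a single Go struct field line (illustrative helper)."""
    json_key = row["yaml_key"].split(".")[-1]  # last dotted segment
    toml_key = row["toml_key"].split(".")[-1]
    return f'{row["go_var"]} {row["go_type"]} `json:"{json_key},omitempty" toml:"{toml_key}"`'


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
field_lines = [render_field(row) for row in rows]
print("\n".join(field_lines))
```

Grouping such field lines under the right struct name is the part that depends on the nesting of yaml_key.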

So far I have managed to write a Python script that almost achieves my goal. The script is:


import pandas as pd


def convert_to_dict(data):
    # Build a nested dict from the split yaml_key segments
    result = {}
    for row in data:
        current_dict = result
        for item in row[:-1]:
            if item is not None:
                if item not in current_dict:
                    current_dict[item] = {}
                current_dict = current_dict[item]
    return result


def extract_json_key(yaml_key):
    # The JSON key is the last dotted segment of the yaml_key
    if isinstance(yaml_key, str) and '.' in yaml_key:
        return yaml_key.split('.')[-1]
    else:
        return yaml_key


def add_fields_to_struct(struct_string, go_var, go_type, json_key, toml_key):
    struct_string += str(go_var) + " " + str(go_type) + ' `json:"' + str(json_key) + ',omitempty" toml:"' + str(toml_key) + '"` ' + "\n"
    return struct_string


def generate_go_struct(struct_name, struct_data):
    struct_name = "Configurations" if struct_name == "" else struct_name
    struct_string = "type " + struct_name + " struct {\n"
    yaml_key = df['yaml_key'].str.split('.').str[-1]

    # base case: generate fields for the current struct level
    for key, value in struct_data.items():
        selected_rows = df[yaml_key == key]
        if len(selected_rows) > 1:
            # NOTE: the index is hard-coded to 1 when a leaf name is repeated
            go_var = selected_rows['go_var'].values[1]
            toml_key = selected_rows['toml_key'].values[1]
            go_type = selected_rows['go_type'].values[1]
            json_key = selected_rows['json_key'].values[1]
        else:
            go_var = selected_rows['go_var'].values[0]
            toml_key = selected_rows['toml_key'].values[0]
            go_type = selected_rows['go_type'].values[0]
            json_key = selected_rows['json_key'].values[0]

        # add fields to the body of the struct
        struct_string = add_fields_to_struct(struct_string, go_var, go_type, json_key, toml_key)

    struct_string += "}\n\n"

    # recursive case: generate struct definitions for nested structs
    for key, value in struct_data.items():
        selected_rows = df[yaml_key == key]
        if len(selected_rows) > 1:
            go_var = selected_rows['go_var'].values[1]
        else:
            go_var = selected_rows['go_var'].values[0]

        if isinstance(value, dict) and any(isinstance(v, dict) for v in value.values()):
            nested_struct_name = go_var
            nested_struct_data = value
            struct_string += generate_go_struct(nested_struct_name, nested_struct_data)

    return struct_string


# read the CSV
csv_file = "~/Downloads/my_configs.csv"
df = pd.read_csv(csv_file)

# remove rows where all columns are NaN
df = df.dropna(how='all')

# create the 'json_key' column using the custom function
df['json_key'] = df['yaml_key'].apply(extract_json_key)

data = df['yaml_key'].values.tolist()  # read the 'yaml_key' column
data = pd.DataFrame({'column': data})  # convert to DataFrame
data = data['column'].str.split('.', expand=True)  # split by '.'
nested_list = data.values.tolist()  # convert to nested list
data = nested_list

result_json = convert_to_dict(data)  # convert to dict (JSON)

# the generated Go code
go_struct = generate_go_struct("", result_json)

# write to file
file_path = "output.go"
with open(file_path, "w") as file:
    file.write(go_struct)


The problem is this (see the part of the CSV below):



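Here, because the parameters field is repeated under both basic and totp, the script gets confused and generates two totpparameters structs. The expected result is one basicparameters struct and one totpparameters struct. There are many similar repeated words in the yaml_key column of the CSV.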

I know this has to do with the index being hard-coded to 1 in go_var = selected_rows['go_var'].values[1], but I am having a hard time fixing it.
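The collision can be reproduced on a tiny scale. The sketch below uses made-up rows (the keys authenticators.basic.parameters and authenticators.totp.parameters and their go_var values are assumptions, not copied from the real CSV). It shows that looking a row up by the last segment of yaml_key is ambiguous, while looking it up by the full dotted path is not.

```python
# Hypothetical rows; only the column names come from the script in this post.
rows = [
    {"yaml_key": "authenticators.basic.parameters", "go_var": "BasicParameters"},
    {"yaml_key": "authenticators.totp.parameters", "go_var": "TotpParameters"},
]


def lookup_by_leaf(leaf):
    """Mimic the script: match on the last segment, take index 1 if several rows match."""
    matches = [r for r in rows if r["yaml_key"].split(".")[-1] == leaf]
    return matches[1]["go_var"] if len(matches) > 1 else matches[0]["go_var"]


def lookup_by_path(path):
    """Match on the full dotted path, which identifies exactly one row here."""
    matches = [r for r in rows if r["yaml_key"] == path]
    return matches[0]["go_var"]


# Both "parameters" nodes resolve to the same row when matched by leaf name.
by_leaf = [lookup_by_leaf("parameters"), lookup_by_leaf("parameters")]
by_path = [lookup_by_path(r["yaml_key"]) for r in rows]

print(by_leaf)  # ['TotpParameters', 'TotpParameters']
print(by_path)  # ['BasicParameters', 'TotpParameters']
```

Any fix along these lines needs each node to know its full path, not just its own name.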


The root cause may be a problem in the recursive function or in the code that generates the JSON.


I also tried ChatGPT, but because this involves nesting and recursion, the answers it gave were not very useful.


I found that the problem occurs for the rows containing the properties, pooloptions, endpoint and parameters fields. This is because they are repeated in the yaml_key column.


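I was able to solve this. However, I had to take a completely different approach: build a tree data structure and then traverse it. This is the main logic behind it: https://www.geeksforgeeks.org/level-order-tree-traversal/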
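The idea can be sketched in isolation with the standard library only. This is a simplified illustration, not the script itself: it uses a nested dict instead of a node class, the function names build_tree and level_order_paths are invented for the example, and the sample keys are hypothetical. The point is that a breadth-first walk can carry the full dotted path of every node, so two nodes that share a name stay distinct.

```python
from collections import deque


def build_tree(dotted_keys):
    """Build a nested dict where each dotted key contributes one branch."""
    root = {}
    for key in dotted_keys:
        node = root
        for part in key.split("."):
            node = node.setdefault(part, {})
    return root


def level_order_paths(root):
    """Yield (full_path, child_names) for every node with children, breadth first."""
    queue = deque([("", root)])
    while queue:
        path, node = queue.popleft()
        if not node:
            continue  # leaf: would become a plain field, not a struct
        yield path, list(node)
        for name, child in node.items():
            child_path = f"{path}.{name}" if path else name
            queue.append((child_path, child))


# Hypothetical keys with a repeated leaf-level name ("parameters").
keys = [
    "authenticators.basic.parameters.enabled",
    "authenticators.totp.parameters.issuer",
]
visited = list(level_order_paths(build_tree(keys)))
for path, children in visited:
    print(repr(path), children)
```

Each yielded path corresponds to one struct to generate, and the two parameters nodes appear as separate entries because their paths differ.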


import pandas as pd
from collections import deque

structs = []


class TreeNode:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.path = ""  # full dotted path, filled in during traversal

    def add_child(self, child):
        self.children.append(child)


def create_tree(data):
    # One node per dotted segment; shared prefixes share nodes
    root = TreeNode('')
    for item in data:
        node = root
        for name in item.split('.'):
            existing_child = next((child for child in node.children if child.name == name), None)
            if existing_child:
                node = existing_child
            else:
                new_child = TreeNode(name)
                node.add_child(new_child)
                node = new_child
    return root


def generate_go_struct(struct_data):
    struct_name = struct_data['struct_name']
    fields = struct_data['fields']

    go_struct = f"type {struct_name} struct {{\n"
    for field in fields:
        field_name = field['name']
        field_type = field['type']
        field_default_val = str(field['default_val'])
        json_key = field['json_key']
        toml_key = field['toml_key']
        tail_part = f'{field_name} {field_type} `json:"{json_key},omitempty" toml:"{toml_key}"`\n'
        if pd.isna(field['default_val']):
            go_struct += tail_part
        else:
            # Emit the default as a kubebuilder marker comment above the field
            field_default_val = "// +kubebuilder:default:=" + field_default_val
            go_struct += field_default_val + "\n" + tail_part
    go_struct += "}\n\n"
    return go_struct


def write_go_file(go_structs, file_path):
    with open(file_path, 'w') as file:
        for go_struct in go_structs:
            file.write(go_struct)


def create_new_struct(struct_name):
    struct_name = "Configurations" if struct_name == "" else struct_name
    struct_dict = {
        "struct_name": struct_name,
        "fields": []
    }
    return struct_dict


def add_field(struct_dict, field_name, field_type, default_val, json_key, toml_key):
    field_dict = {
        "name": field_name,
        "type": field_type,
        "default_val": default_val,
        "json_key": json_key,
        "toml_key": toml_key
    }
    struct_dict["fields"].append(field_dict)
    return struct_dict


def traverse_tree(root):
    # Level-order (breadth-first) traversal; rows are matched by full path
    queue = deque([root])
    while queue:
        node = queue.popleft()
        filtered_df = df[df['yaml_key'] == node.path]
        go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
        go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None
        if node.path == "":
            go_type = "Configurations"

        # The structs themselves
        current_struct = create_new_struct(go_type)

        for child in node.children:
            if node.name != "":
                child.path = node.path + "." + child.name
            else:
                child.path = child.name

            filtered_df = df[df['yaml_key'] == child.path]
            go_var = filtered_df['go_var'].values[0] if not filtered_df.empty else None
            go_type = filtered_df['go_type'].values[0] if not filtered_df.empty else None
            default_val = filtered_df['default_val'].values[0] if not filtered_df.empty else None

            # Struct fields
            json_key = filtered_df['yaml_key'].values[0].split('.')[-1] if not filtered_df.empty else None
            toml_key = filtered_df['toml_key'].values[0].split('.')[-1] if not filtered_df.empty else None

            current_struct = add_field(current_struct, go_var, go_type, default_val, json_key, toml_key)

            if child.children:
                # Add each child to the queue for processing
                queue.append(child)

        go_struct = generate_go_struct(current_struct)
        # print(go_struct, "\n")
        structs.append(go_struct)

    write_go_file(structs, "output.go")


csv_file = "~/Downloads/my_configs.csv"
df = pd.read_csv(csv_file)
# Defensive: skip rows without a yaml_key so item.split('.') does not fail on NaN
df = df.dropna(subset=['yaml_key'])
sample_data = df['yaml_key'].values.tolist()

# Create the tree
tree = create_tree(sample_data)

# Traverse the tree
traverse_tree(tree)



