Fastest way to create JSON to reflect a tree structure in Python / Django using mptt

Question

What is the fastest way in Python (Django) to create JSON from a Django queryset? Note that parsing it in the template, as proposed here, is not an option.

The background is that I wrote a method which loops over all nodes in a tree, but it is already terribly slow when converting about 300 nodes. The first (and probably the worst) idea that came to my mind was to create the JSON "manually". See the code below.

# Solution 1
from django.utils.encoding import smart_str, smart_unicode  # Django 1.x on Python 2

def quoteStr(input):
    return "\"" + smart_str(smart_unicode(input)) + "\""

# indent: prefix for each emitted line (assumed; it was undefined in the original snippet)
def createJSONTreeDump(user, node, root=False, lastChild=False, indent=""):
    # open tag for object
    json = str("\n" + indent + "{" +
               quoteStr("name") + ": " + quoteStr(node.name) + ",\n" +
               quoteStr("id") + ": " + quoteStr(node.pk))

    childrenTag = "children"
    children = node.get_children()
    if children.count() > 0:
        # create children array opening tag
        json += str(",\n" + indent + quoteStr(childrenTag) + ": [")
        for idx, child in enumerate(children):
            if (idx + 1) == children.count():
                # recursive call for the last child
                json += createJSONTreeDump(user, child, False, True, indent)
            else:
                # recursive call, more siblings follow
                json += createJSONTreeDump(user, child, False, False, indent)
        # add children closing tag
        json += "]"
    json += "\n"

    # closing tag for object
    if lastChild == False:
        # more children following, add ","
        json += indent + "},\n"
    else:
        # last child, do not add ","
        json += indent + "}\n"
    return json

The tree structure to be rendered is a tree built with mptt, where calling .get_children() returns all children of a node.

The model is as simple as this, with mptt taking care of everything else.

class Node(MPTTModel, ExtraManager):
    """
    Representation of a single node
    """ 
    name = models.CharField(max_length=200)
    parent = TreeForeignKey('self', null=True, blank=True, related_name='%(app_label)s_%(class)s_children')

The resulting JSON is then used in the template like this: var root = {{ jsonTree|safe }}

Based on this answer I wrote the following code (definitely the better code), but it feels only slightly faster.

Solution 2:

def serializable_object(node):
    "Recurse into tree to build a serializable object"
    obj = {'name': node.name, 'id': node.pk, 'children': []}
    for child in node.get_children():
        obj['children'].append(serializable_object(child))
    return obj

import json
jsonTree = json.dumps(serializable_object(nodeInstance))

Solution 3:

def serializable_object_List_Comprehension(node):
    "Recurse into tree to build a serializable object"
    obj = {
        'name': node.name,
        'id': node.pk,
        'children': [serializable_object_List_Comprehension(ch) for ch in node.get_children()]
    }
    return obj

Solution 4:

def recursive_node_to_dict(node):
    result = {
        'name': node.name, 'id': node.pk
    }
    children = [recursive_node_to_dict(c) for c in node.get_children()],
    if children is not None:
        result['children'] = children
    return result

from mptt.templatetags.mptt_tags import cache_tree_children
root_nodes = cache_tree_children(root.get_descendants())
dicts = []
for n in root_nodes:
    dicts.append(recursive_node_to_dict(n))
jsonTree = json.dumps(dicts, indent=4)

Solution 5 (uses select_related to prefetch, though I am not sure it is used correctly):

def serializable_object_select_related(node):
    "Recurse into tree to build a serializable object, make use of select_related"
    obj = {'name': node.get_wbs_code(), 'wbsCode': node.get_wbs_code(), 'id': node.pk, 'level': node.level, 'position': node.position, 'children': []}
    for child in node.get_children().select_related():
        obj['children'].append(serializable_object_select_related(child))
    return obj

Solution 6 (improved solution 4, using caching of child nodes):

def recursive_node_to_dict(node):
    return {
        'name': node.name, 'id': node.pk,
         # Notice the use of node._cached_children instead of node.get_children()
        'children' : [recursive_node_to_dict(c) for c in node._cached_children]
    }

Called via:

from mptt.templatetags.mptt_tags import cache_tree_children
subTrees = cache_tree_children(root.get_descendants(include_self=True))
subTreeDicts = []
for subTree in subTrees:
    subTree = recursive_node_to_dict(subTree)
    subTreeDicts.append(subTree)
jsonTree = json.dumps(subTreeDicts, indent=4)
# optional clean-up: remove the [ ] at the beginning and the end, as needed for D3.js
jsonTree = jsonTree[1:-1]
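
When there is exactly one root, as with include_self=True above, an alternative to cutting the brackets off the string is to serialize the single root dict directly. The following is only a sketch for Python 3: the sample data is hand-written and merely stands in for the output of recursive_node_to_dict.

```python
import json

# Hypothetical sample data standing in for subTreeDicts (one root with one child).
subTreeDicts = [
    {"name": "root", "id": 1, "children": [
        {"name": "child", "id": 2, "children": []},
    ]},
]

# Variant from the question: dump the list, then strip the outer "[" and "]".
sliced = json.dumps(subTreeDicts, indent=4)[1:-1]

# Alternative: dump the single root object itself.
direct = json.dumps(subTreeDicts[0], indent=4)

# Both strings parse to the same object; only the whitespace differs.
same = json.loads(sliced) == json.loads(direct)
print(same)
```

With more than one root the slicing variant would leave several comma-separated objects, which is no longer a single JSON document, so this only works for a single-rooted tree.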

Below you can see the profiling results, created using cProfile as suggested by MuMind. A Django view starts the stand-alone method profileJSON(), which in turn calls the different solutions to create the JSON output.

def startProfileJSON(request):
    print("startProfileJSON")
    import cProfile
    # profile profileJSON() and print the statistics to stdout
    cProfile.runctx('profileJSON()', globals=globals(), locals=locals())
    print("endProfileJSON")
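
To see which calls dominate rather than only the totals, the profiler output can be sorted and truncated with pstats from the standard library. This is a minimal Python 3 sketch; build_tree is a made-up toy workload standing in for profileJSON(), not code from the question.

```python
import cProfile
import io
import pstats

def build_tree(depth, width):
    """Toy stand-in for a JSON-building function (hypothetical workload)."""
    if depth == 0:
        return {"children": []}
    return {"children": [build_tree(depth - 1, width) for _ in range(width)]}

profiler = cProfile.Profile()
profiler.enable()
build_tree(depth=4, width=4)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 rows only.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In a real run against Django, database-related functions near the top of the cumulative list would point at query overhead rather than JSON encoding.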

Results:

Solution 1: 3350347 function calls (3130372 primitive calls) in 4.969 seconds (details)

Solution 2: 2533705 function calls (2354516 primitive calls) in 3.630 seconds (details)

Solution 3: 2533621 function calls (2354441 primitive calls) in 3.684 seconds (details)

Solution 4: 2812725 function calls (2466028 primitive calls) in 3.840 seconds (details)

Solution 5: 2536504 function calls (2357256 primitive calls) in 3.779 seconds (details)

Solution 6 (improved solution 4): 2593122 function calls (2299165 primitive calls) in 3.663 seconds (details)

Discussion:

Solution 1: own encoding implementation. Bad idea.

Solution 2 + 3: currently the fastest, but still painfully slow.

Solution 4: caching the children looks promising, but it performs similarly and currently produces wrongly structured JSON, because the children are put into double []:

"children": [[]] instead of "children": []

Solution 5: using select_related makes no difference, though it is probably used the wrong way, since a node always has a ForeignKey to its parent while we are traversing from root to child.

Update: Solution 6: to me it looks like the cleanest solution, using the cache of child nodes. But it only performs similarly to solutions 2 + 3, which I find strange.
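
The double [] reported for solution 4 is consistent with the trailing comma after the list comprehension in that snippet: in Python a trailing comma turns the right-hand side into a one-element tuple, and json.dumps serializes a tuple as a list. A small stand-alone sketch with made-up data:

```python
import json

children = [{"id": 2}],          # trailing comma: a tuple containing one list
fixed_children = [{"id": 2}]     # no trailing comma: a plain list

nested = json.dumps({"children": children})
flat = json.dumps({"children": fixed_children})
print(nested)  # {"children": [[{"id": 2}]]}
print(flat)    # {"children": [{"id": 2}]}
```

For the same reason the check `if children is not None` in solution 4 is always true, so even leaf nodes get a "children" entry.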

Does anybody have more ideas for performance improvements?

Recommended answer

I suspect that by far the biggest slowdown is that this does one database query per node. The JSON rendering is trivial in comparison to the hundreds of round-trips to your database.
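
To make the "one query per node" point concrete, here is a toy model without Django. FakeNode is a hypothetical stand-in for an mptt node, and its counter only simulates database round-trips; it is not how django-mptt is implemented.

```python
class FakeNode:
    """Toy stand-in for an mptt node; not a Django model."""
    query_count = 0  # simulated database round-trips

    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

    def get_children(self):
        # Each call stands for one database query in the uncached case.
        FakeNode.query_count += 1
        return self._children

def to_dict(node):
    return {"name": node.name,
            "children": [to_dict(c) for c in node.get_children()]}

# A root with 3 children, each with 2 leaves: 1 + 3 + 6 = 10 nodes.
root = FakeNode("root", [
    FakeNode("c%d" % i, [FakeNode("c%d-%d" % (i, j)) for j in range(2)])
    for i in range(3)
])
tree = to_dict(root)
print(FakeNode.query_count)  # 10, one simulated query per node
```

Leaves count too: asking a leaf for its children still costs a lookup, so a tree of about 300 nodes means about 300 round-trips in this model.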

You should cache the children on each node so that those queries can be done all at once. django-mptt has a cache_tree_children() function you can do this with.

import json
from mptt.templatetags.mptt_tags import cache_tree_children

def recursive_node_to_dict(node):
    result = {
        'id': node.pk,
        'name': node.name,
    }
    children = [recursive_node_to_dict(c) for c in node.get_children()]
    if children:
        result['children'] = children
    return result

root_nodes = cache_tree_children(Node.objects.all())
dicts = []
for n in root_nodes:
    dicts.append(recursive_node_to_dict(n))

print(json.dumps(dicts, indent=4))
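
A different way to get the same effect, not part of the answer above, is to fetch plain rows in a single query (for example with a queryset's values()) and assemble the nesting in memory. The sketch below assumes hand-written rows with the keys id, parent_id and name; those keys and the helper build_forest are illustrative assumptions, not django-mptt API.

```python
import json

# Hypothetical rows, e.g. the result of one query returning id, parent_id and name.
rows = [
    {"id": 1, "parent_id": None, "name": "root"},
    {"id": 2, "parent_id": 1, "name": "a"},
    {"id": 3, "parent_id": 1, "name": "b"},
    {"id": 4, "parent_id": 2, "name": "a1"},
]

def build_forest(rows):
    """Assemble nested dicts from flat rows without further lookups."""
    by_id = {r["id"]: {"id": r["id"], "name": r["name"], "children": []}
             for r in rows}
    roots = []
    for r in rows:
        node = by_id[r["id"]]
        parent = by_id.get(r["parent_id"])
        if parent is None:
            roots.append(node)      # no parent in the result set: a root
        else:
            parent["children"].append(node)
    return roots

forest = build_forest(rows)
print(json.dumps(forest, indent=4))
```

Children appear in the order of the rows, so the query would need to be ordered the way the tree should be displayed.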

Custom JSON encoding might provide a slight speedup in some scenarios, but I'd strongly discourage it: it is a lot of code, and it is easy to get very wrong.
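
One way hand-written encoding goes wrong is escaping. The sketch below uses a simplified quote_str helper in the style of solution 1 (an illustration, not the asker's exact function) and a made-up node name containing a quote and a newline.

```python
import json

def quote_str(value):
    """Naive quoting in the style of solution 1: no escaping at all."""
    return '"' + str(value) + '"'

name = 'Node "A"\nsecond line'

hand_rolled = "{" + quote_str("name") + ": " + quote_str(name) + "}"
library = json.dumps({"name": name})

try:
    json.loads(hand_rolled)
    hand_rolled_valid = True
except ValueError:
    hand_rolled_valid = False

print(hand_rolled_valid)                    # False
print(json.loads(library)["name"] == name)  # True
```

json.dumps escapes the quote and the newline, so the value survives a round-trip, while the concatenated string is not parseable at all.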
