Wednesday, January 2, 2013

[Python] PEP 8: Style Guide for Python Code (1) ~ Introduction, Code Lay-out, Whitespace in Expressions and Statements


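Introduction

 This document gives the coding conventions that Python code should follow, including the code that makes up the Python standard library. It can be read alongside the companion PEP that describes the style guide for the C code in the C implementation of Python [1].

 This document and PEP 257 (Docstring Conventions) were adapted from the original Python Style Guide essay by Guido van Rossum (the creator of Python), with additions from the style guide of Barry A. Warsaw (the maintainer of Mailman) [2].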

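A Foolish Consistency is the Hobgoblin of Little Minds

(Translator's note: the heading, "A foolish consistency is the hobgoblin of little minds", is a well-known line by the American thinker Ralph Waldo Emerson. In Chinese it would more properly be rendered as 「愚蠢的附和乃庸人的心魔」 ("foolish conformity is the hobgoblin of little minds"), with the connotation that fools think alike.)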

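 One of Guido's key insights is that code is read much more often than it is written. This guide therefore aims to improve the readability of Python code and to keep Python code consistent across different fields and applications. As PEP 20 says, "Readability counts."

 A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is most important.

 But most important of all: know when to be inconsistent. Sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best, and don't hesitate to ask more experienced developers!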

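 Two good reasons to break a particular rule:
  1. When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
  2. To be consistent with existing code that also breaks the rule (maybe for historic reasons), although this is also an opportunity to clean up someone else's mess (in true XP style).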

 

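Code Lay-out

Indentation

 Use 4 spaces per indentation level.

 For really old code that you don't want to mess up, you can continue to use 8-space tabs.

 Continuation lines should align wrapped elements. When the elements inside parentheses, brackets or braces have to be split across lines, either align them vertically with the opening delimiter so that they read as one visual group, or use a hanging indent (translator's note: the elements start on the line after the opening delimiter). When using a hanging indent, apply the following considerations: there should be no arguments on the first line, and further indentation should be used to clearly distinguish the continuation lines from the rest of the code.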

Yes:
# Aligned with opening delimiter
foo = long_function_name(var_one, var_two,
                         var_three, var_four)

# More indentation included to distinguish this from the rest
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

No:
# Arguments on first line forbidden when not using vertical alignment
foo = long_function_name(var_one, var_two,
    var_three, var_four)

# Further indentation required as indentation is not distinguishable
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    print(var_one)

Optional:
# Extra indentation is not necessary.
foo = long_function_name(
  var_one, var_two,
  var_three, var_four)

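Tabs or Spaces?

 Never mix tabs and spaces.

 The most popular way of indenting Python is with spaces only. The second-most popular way is with tabs only. Code indented with a mixture of tabs and spaces should be converted to using spaces exclusively. When the Python command line interpreter is invoked with the -t option, it issues warnings about code that illegally mixes tabs and spaces. With -tt these warnings become errors. These options are highly recommended!

 For new projects, spaces-only indentation is strongly recommended over tabs. Most editors have features that make this easy to do.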
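
 As a rough illustration of the "never mix" rule, the sketch below flags lines whose leading whitespace contains both tabs and spaces. The helper name find_mixed_indentation and the sample text are made up for this example, and the check is deliberately simpler than what the interpreter's -t option does: it only looks at each line in isolation.

```python
def find_mixed_indentation(source):
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces.

    Hypothetical helper for illustration only; it inspects each line on its
    own and does not compare indentation across lines.
    """
    mixed = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indentation is everything before the first non-blank character.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            mixed.append(number)
    return mixed


# Line 2 uses spaces only, line 3 a tab only, line 4 mixes both.
sample = "def f():\n    a = 1\n\tb = 2\n  \tc = 3\n"
print(find_mixed_indentation(sample))
```

 A real project would more likely rely on the editor or a linter for this; the point is only that mixed indentation is easy to detect mechanically.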

Maximum Line Length

 Limit all lines to a maximum of 79 characters.

 There are still many devices around that are limited to 80 character lines; plus, limiting windows to 80 characters makes it possible to have several windows side-by-side. The default wrapping on such devices disrupts the visual structure of the code, making it more difficult to understand. Therefore, please limit all lines to a maximum of 79 characters. For flowing long blocks of text (docstrings or comments), limiting the length to 72 characters is recommended.
 The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation. Make sure to indent the continued line appropriately. The preferred place to break around a binary operator is after the operator, not before it. Some examples:
class Rectangle(Blob):

    def __init__(self, width, height,
                 color='black', emphasis=None, highlight=0):
        if (width == 0 and height == 0 and
            color == 'red' and emphasis == 'strong' or
            highlight > 100):
            raise ValueError("sorry, you lose")
        if width == 0 and height == 0 and (color == 'red' or
                                           emphasis is None):
            raise ValueError("I don't think so -- values are %s, %s" %
                             (width, height))
        Blob.__init__(self, width, height,
                      color, emphasis, highlight)
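
 The 79-character limit is also easy to check mechanically. The following is a minimal sketch, not an official tool: overlong_lines and MAX_LINE_LENGTH are names invented for this example, and the function simply reports every line longer than the limit.

```python
MAX_LINE_LENGTH = 79  # limit for code lines, as stated in the text above


def overlong_lines(source, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`.

    Hypothetical helper for illustration; line numbers are 1-based.
    """
    return [(number, len(line))
            for number, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]


# The second line is 4 + 80 = 84 characters long.
sample = "x = 1\n" + "y = " + "1" * 80 + "\n"
print(overlong_lines(sample))
```

 Passing limit=72 would apply the stricter recommendation for long blocks of text such as docstrings and comments.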

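Blank Lines

 Separate top-level function and class definitions with two blank lines.

 Method definitions inside a class are separated by a single blank line.

 Extra blank lines may be used (sparingly) to separate groups of related functions. The blank lines that would normally separate functions may be omitted between a bunch of related one-liners (e.g. a set of dummy implementations of abstract methods).

 Use blank lines in functions, sparingly, to indicate logical sections.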

 Python accepts the control-L (i.e. ^L) form feed character as whitespace; many tools treat these characters as page separators, so you may use them to separate pages of related sections of your file. Note, some editors and web-based code viewers may not recognize control-L as a form feed and will show another glyph in its place.
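
 The blank-line rules above can be seen together in a small made-up module. The names top_level_function and Greeter are purely illustrative.

```python
"""Hypothetical module illustrating the blank-line conventions."""


def top_level_function():
    # Two blank lines separate this function from the docstring above
    # and from the class below.
    return "top"


class Greeter:

    def hello(self):
        return "hello"

    def goodbye(self):
        # A single blank line separates methods; inside a method, a blank
        # line may (sparingly) mark off a logical section.
        words = ["good", "bye"]

        return "".join(words)
```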

Encodings (PEP 263)

Code in the core Python distribution should always use the ASCII or Latin-1 encoding (a.k.a. ISO-8859-1). For Python 3.0 and beyond, UTF-8 is preferred over Latin-1, see PEP 3120.
Files using ASCII should not have a coding cookie. Latin-1 (or UTF-8) should only be used when a comment or docstring needs to mention an author name that requires Latin-1; otherwise, using \x, \u or \U escapes is the preferred way to include non-ASCII data in string literals.
For Python 3.0 and beyond, the following policy is prescribed for the standard library (see PEP 3131): All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the latin alphabet MUST provide a latin transliteration of their names.
Open source projects with a global audience are encouraged to adopt a similar policy.
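
 To make the escape forms concrete, here is a minimal sketch, assuming Python 3 string literals (in Python 2 the \u and \U forms only apply to unicode literals). The variable names are invented for the example; all three literals keep the source file pure ASCII while denoting the same non-ASCII character, U+00E9.

```python
# Three ways to write the character U+00E9 without leaving ASCII in the source.
by_hex = "caf\xe9"         # \x takes exactly two hex digits
by_u16 = "caf\u00e9"       # \u takes exactly four hex digits
by_u32 = "caf\U000000e9"   # \U takes exactly eight hex digits

# All three literals produce the same four-character string.
print(by_hex == by_u16 == by_u32)
print(len(by_hex))
```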

Imports

  • Imports should usually be on separate lines, e.g.:
    Yes: import os
         import sys
    
    No:  import sys, os
    
    It's okay to say this though:
    from subprocess import Popen, PIPE
    
  • Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
    Imports should be grouped in the following order:
    1. standard library imports
    2. related third party imports
    3. local application/library specific imports
    You should put a blank line between each group of imports.
    Put any relevant __all__ specification after the imports.
  • Relative imports for intra-package imports are highly discouraged. Always use the absolute package path for all imports. Even now that PEP 328 is fully implemented in Python 2.5, its style of explicit relative imports is actively discouraged; absolute imports are more portable and usually more readable.
  • When importing a class from a class-containing module, it's usually okay to spell this:
    from myclass import MyClass
    from foo.bar.yourclass import YourClass
    
    If this spelling causes local name clashes, then spell them
    import myclass
    import foo.bar.yourclass
    
    and use "myclass.MyClass" and "foo.bar.yourclass.YourClass".

Whitespace in Expressions and Statements

Pet Peeves

 Avoid extraneous whitespace in the following situations:
  • Immediately inside parentheses, brackets or braces.
    Yes: spam(ham[1], {eggs: 2})
    No:  spam( ham[ 1 ], { eggs: 2 } )
    
  • Immediately before a comma, semicolon, or colon:
    Yes: if x == 4: print x, y; x, y = y, x
    No:  if x == 4 : print x , y ; x , y = y , x
    
  • Immediately before the open parenthesis that starts the argument list of a function call:
    Yes: spam(1)
    No:  spam (1)
    
  • Immediately before the open parenthesis that starts an indexing or slicing:
    Yes: dict['key'] = list[index]
    No:  dict ['key'] = list [index]
    
  • More than one space around an assignment (or other) operator to align it with another.
    Yes:
    x = 1
    y = 2
    long_variable = 3
    
    No:
    x             = 1
    y             = 2
    long_variable = 3
    

Other Recommendations

  • Always surround these binary operators with a single space on either side: assignment (=), augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), Booleans (and, or, not).
  • If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgement; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator.
    Yes:
    i = i + 1
    submitted += 1
    x = x*2 - 1
    hypot2 = x*x + y*y
    c = (a+b) * (a-b)
    
    No:
    i=i+1
    submitted +=1
    x = x * 2 - 1
    hypot2 = x * x + y * y
    c = (a + b) * (a - b)
    
  • Don't use spaces around the = sign when used to indicate a keyword argument or a default parameter value.
    Yes:
    def complex(real, imag=0.0):
        return magic(r=real, i=imag)
    
    No:
    def complex(real, imag = 0.0):
        return magic(r = real, i = imag)
    
  • Compound statements (multiple statements on the same line) are generally discouraged.
    Yes:
    if foo == 'blah':
        do_blah_thing()
    do_one()
    do_two()
    do_three()
    

    Rather not:
    if foo == 'blah': do_blah_thing()
    do_one(); do_two(); do_three()
    
  • While sometimes it's okay to put an if/for/while with a small body on the same line, never do this for multi-clause statements. Also avoid folding such long lines!
    Rather not:
    if foo == 'blah': do_blah_thing()
    for x in lst: total += x
    while t < 10: t = delay()
    
    Definitely not:
    if foo == 'blah': do_blah_thing()
    else: do_non_blah_thing()
    
    try: something()
    finally: cleanup()
    
    do_one(); do_two(); do_three(long, argument,
                                 list, like, this)
    
    if foo == 'blah': one(); two(); three()
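
 Several of the recommendations above can be combined in one small runnable sketch. The function scaled_hypot and the values used are invented for illustration; the point is only the spacing around operators, the unspaced = for defaults and keyword arguments, and one statement per line.

```python
def scaled_hypot(x, y, scale=1.0):
    # No spaces around "=" for the default value above.
    hypot2 = x*x + y*y  # tighter spacing around the higher-priority "*"
    result = hypot2 ** 0.5
    result *= scale  # augmented assignment gets a space on each side
    return result


total = 0
for value in [3, 4]:
    total += value

# Multi-clause statements are written one clause per line.
if total == 7:
    length = scaled_hypot(3, 4, scale=2.0)
else:
    length = scaled_hypot(3, 4)
print(length)
```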
    
