[PYTHON] Python Practice

겉바속촉 2021. 1. 6. 12:05


Hello,

this is 겉바속촉 !^^!

!!Getting started with Python!!

Let's practice using what we learned about reading and writing files.

Finding the largest value

data.txt

Annual Number of Lynx Trapped, MacKenzie River, 1821-1934
#Original Source: Elton, C. and Nicholson, M. (1942)
#"The ten year cycle in numbers of Canadian lynx",
#J. Animal Ecology, Vol. 11, 215--244.
#This is the famous data set which has been listed before in
#various publications:
#Cambell, M.J. and Walker, A.M. (1977) "A survey of statistical work on
#the MacKenzie River series of annual Canadian lynx trappings for the years
#1821-1934 with a new analysis", J.Roy.Statistical Soc. A 140, 432--436.
  269.  321.  585.  871. 1475. 2821. 3928. 5943. 4950. 2577.  523.   98.       
  184.  279.  409. 2285. 2685. 3409. 1824.  409.  151.   45.   68.  213.    	
  546. 1033. 2129. 2536.  957.  361.  377.  225.  360.  731. 1638. 2725.    	
 2871. 2119.  684.  299.  236.  245.  552. 1623. 3311. 6721. 4245.  687.    	
  255.  473.  358.  784. 1594. 1676. 2251. 1426.  756.  299.  201.  229.    	
  469.  736. 2042. 2811. 4431. 2511.  389.   73.   39.   49.   59.  188.    	
  377. 1292. 4031. 3495.  587.  105.  153.  387.  758. 1307. 3465. 6991.    	
 6313. 3794. 1836.  345.  382.  808. 1388. 2713. 3800. 3091. 2985. 3790.    	
  674.   81.   80.  108.  229.  399. 1132. 2432. 3574. 2935. 1537.  529.

sample.py

from typing import TextIO

# Skip the header and return the first line of data
def skip_header(reader: TextIO) -> str:
    '''Skip the header in reader and return the first line of data.'''

    # Read the description line.
    line = reader.readline()

    # Find the first line that is not a comment.
    line = reader.readline()
    while line.startswith('#'):
        line = reader.readline()

    # Return the first line of data that was read.
    return line


# Find and return the largest value on the given line
def find_largest(line: str) -> int:
    largest = -1

    for value in line.split():
        # Drop the trailing '.' from each value (e.g. '269.' -> 269)
        v = int(value[:-1])
        if largest < v:
            largest = v

    return largest


# Read the file and print the largest value in it
def process_file(reader: TextIO) -> None:
    # Start with the largest value on the first line of data
    line = skip_header(reader).strip()
    largest = find_largest(line)

    # Compare it against each of the remaining lines of data
    for line in reader:
        value = find_largest(line)
        if largest < value:
            largest = value
    print(largest)

with open('data.txt', 'r') as input_file:
    process_file(input_file)

Result
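For the data.txt shown above, the program should print 6991. As a cross-check, here is a minimal sketch of a shorter variant that uses the built-in max() and float(), so the trailing '.' does not have to be sliced off. The name largest_in_file and the SAMPLE text are made up for this illustration, and the sketch assumes the same layout as data.txt: one description line, '#' comment lines, then whitespace-separated numbers.

```python
from io import StringIO
from typing import TextIO

# Made-up sample in the same layout as data.txt (assumption: one
# description line, '#' comment lines, then whitespace-separated numbers).
SAMPLE = """Description line
#comment
  269.  321.  585.
 6721. 4245.  687.
   39.   49. 6991.
"""


def largest_in_file(reader: TextIO) -> float:
    """Return the largest number found in the data lines of reader."""
    reader.readline()  # skip the description line
    values = [
        float(token)               # float('269.') parses without slicing
        for line in reader
        if not line.startswith('#')
        for token in line.split()
    ]
    return max(values)             # raises ValueError if there is no data


print(largest_in_file(StringIO(SAMPLE)))  # 6991.0
```

Note that this version returns a float and raises ValueError on a file with no data, whereas find_largest above returns -1 for an empty line.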

Now, shall we modularize this??

Let's create one more file :)

Skipping a header is something we often need in other programs too,

so let's create file_handler.py and move just the skip_header part into it!!

from typing import TextIO


# Skip the header and return the first line of data
def skip_header(reader: TextIO) -> str:
    '''Skip the header in reader and return the first line of data.'''

    # Read the description line.
    line = reader.readline()

    # Find the first line that is not a comment.
    line = reader.readline()
    while line.startswith('#'):
        line = reader.readline()

    # Return the first line of data that was read.
    return line
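Once skip_header lives in its own module, it is easy to try it on its own without a real file. The following is a small sketch that feeds it an in-memory file via io.StringIO. The function body is repeated here only so the snippet runs by itself, and the contents of fake_file are invented for the example.

```python
from io import StringIO
from typing import TextIO


def skip_header(reader: TextIO) -> str:
    """Same logic as file_handler.skip_header, repeated so this runs alone."""
    reader.readline()              # description line
    line = reader.readline()
    while line.startswith('#'):    # skip comment lines
        line = reader.readline()
    return line


# Invented content: one description line, two comments, two data lines.
fake_file = StringIO("Title line\n#comment 1\n#comment 2\n  12.  34.\n  56.  78.\n")

first = skip_header(fake_file)
rest = fake_file.read()            # whatever skip_header left unread

print(repr(first))  # '  12.  34.\n'
print(repr(rest))   # '  56.  78.\n'
```

The reader is left positioned right after the first data line, which is why process_file can keep looping over the same reader for the remaining lines.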

Then we import it in sample.py.

So the sample.py code becomes:

from typing import TextIO
import file_handler

# Find and return the largest value on the given line
def find_largest(line: str) -> int:
    largest = -1

    for value in line.split():
        # Drop the trailing '.' from each value (e.g. '269.' -> 269)
        v = int(value[:-1])
        if largest < v:
            largest = v

    return largest


# Read the file and print the largest value in it
def process_file(reader: TextIO) -> None:
    # Start with the largest value on the first line of data
    line = file_handler.skip_header(reader).strip()
    largest = find_largest(line)

    # Compare it against each of the remaining lines of data
    for line in reader:
        value = find_largest(line)
        if largest < value:
            largest = value
    print(largest)

with open('data.txt', 'r') as input_file:
    process_file(input_file)

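One thing to watch out for here:

we learned that when a module is imported, Python automatically runs its top-level code.

So, to keep that code from running again on import, wrap it like this: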

if __name__ == '__main__':
    with open('data.txt', 'r') as input_file:
        # process_file prints the largest value itself
        process_file(input_file)

In other words, this block runs only when the file itself is executed directly!!
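To see this behavior in action, here is a small self-contained sketch. It writes a throwaway module to a temporary directory and imports it. The module name guard_demo and its contents are hypothetical, made up just for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source used only for this demonstration.
MODULE_SOURCE = '''
events = []
events.append('top-level code ran')

if __name__ == '__main__':
    events.append('guarded code ran')
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, 'guard_demo.py').write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp_dir)        # make the temp module importable
    try:
        guard_demo = importlib.import_module('guard_demo')
    finally:
        sys.path.remove(tmp_dir)

# On import, __name__ is the module name, not '__main__',
# so only the unguarded line was executed.
print(guard_demo.__name__)  # guard_demo
print(guard_demo.events)    # ['top-level code ran']
```

If guard_demo.py were run directly with the python command instead, __name__ would be '__main__' and the guarded line would run as well.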

An example with several records of the same structure

pdb_data.txt

COMPND  	AMMONIA 
ATOM  	1  N  0.257  -0.363   0.000   
ATOM  	2  H  0.257   0.727   0.000 
ATOM  	3  H  0.771  -0.727   0.890 
ATOM  	4  H  0.771  -0.727  -0.890
END
COMPND  	METHANOL
ATOM  	1  C  -0.748  -0.015   0.024
ATOM  	2  O  0.558   0.420  -0.278
ATOM  	3  H  -1.293  -0.202  -0.901
ATOM  	4  H  -1.263   0.754   0.600
ATOM  	5  H  -0.699  -0.934   0.609
ATOM  	6  H  0.716   1.404   0.137
END

molecule.py

from typing import Optional, TextIO

def read_molecule(reader: TextIO) -> Optional[list]:
    '''Read one molecule from reader and return it,
    or return None to signal that the file has ended.

    The first item of the result is the molecule name;
    each following list holds an atom type and its X, Y, Z coordinates.

    COMPND  	AMMONIA 
    ATOM  	1  N  0.257  -0.363   0.000   
    ATOM  	2  H  0.257   0.727   0.000 
    ATOM  	3  H  0.771  -0.727   0.890 
    ATOM  	4  H  0.771  -0.727  -0.890
    END

    ['AMMONIA', ['N', '0.257', '-0.363', '0.000'], ['H', '0.257', '0.727', '0.000'], ...]
    '''

    line = reader.readline()
    if not line:
        return None

    # Extract the molecule name
    parts = line.split()
    name = parts[1]

    # Start the list of molecule information with the molecule name
    molecule = [name]

    # Read atom records and add them to the list until END appears
    reading = True
    while reading:
        line = reader.readline()
        # Also stop at end of file, so a missing END cannot loop forever
        if not line or line.startswith("END"):
            reading = False
        else:
            parts = line.split()    # e.g. ['ATOM', '1', 'N', '0.257', '-0.363', '0.000']
            molecule.append(parts[2:])

    return molecule


def read_all_molecules(reader: TextIO) -> list:
    '''Read zero or more molecules from reader and return them as a list.
    '''

    # List of molecules, for example:
    # results =
    # [
    #     ['AMMONIA', ['N', '0.257', '-0.363', '0.000'], ...],
    #     ['METHANOL', ['C', '-0.748', '-0.015', '0.024'], ...]
    # ]
    results = []

    reading = True
    while reading:
        molecule = read_molecule(reader)
        if molecule:
            results.append(molecule)
        else:
            reading = False

    return results

if __name__ == "__main__":
    with open('pdb_data.txt', 'r') as file:
        molecules = read_all_molecules(file)
        print(molecules)

Result
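The coordinates in the result are still strings, because split() only cuts the text apart. If numbers are needed later, a conversion step along these lines could follow. This is only a sketch: to_numeric and the Atom alias are names invented here, and the input is a shortened copy of the AMMONIA record.

```python
from typing import List, Tuple

# (atom type, x, y, z) with numeric coordinates
Atom = Tuple[str, float, float, float]


def to_numeric(molecule: list) -> Tuple[str, List[Atom]]:
    """Split a molecule list into its name and atoms with float coordinates."""
    name, *atoms = molecule
    return name, [
        (atom_type, float(x), float(y), float(z))
        for atom_type, x, y, z in atoms
    ]


# Shortened AMMONIA record in the shape returned by read_molecule.
ammonia = ['AMMONIA',
           ['N', '0.257', '-0.363', '0.000'],
           ['H', '0.257', '0.727', '0.000']]

name, atoms = to_numeric(ammonia)
print(name)   # AMMONIA
print(atoms)  # [('N', 0.257, -0.363, 0.0), ('H', 0.257, 0.727, 0.0)]
```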

So how should we change the code if there is no END line?

pdb_data.txt

COMPND  	AMMONIA 
ATOM  	1  N  0.257  -0.363   0.000   
ATOM  	2  H  0.257   0.727   0.000 
ATOM  	3  H  0.771  -0.727   0.890 
ATOM  	4  H  0.771  -0.727  -0.890
COMPND  	METHANOL
ATOM  	1  C  -0.748  -0.015   0.024
ATOM  	2  O  0.558   0.420  -0.278
ATOM  	3  H  -1.293  -0.202  -0.901
ATOM  	4  H  -1.263   0.754   0.600
ATOM  	5  H  -0.699  -0.934   0.609
ATOM  	6  H  0.716   1.404   0.137

molecule.py

from typing import TextIO, Tuple

def read_molecule(reader: TextIO, line: str) -> Tuple[list, str]:
    '''Read one molecule whose COMPND line has already been read into line.

    Return the molecule together with the next unprocessed line
    (the next COMPND line, or '' at the end of the file).
    '''

    fields = line.split()
    molecule = [fields[1]]  # add the molecule name

    line = reader.readline()
    while line and not line.startswith('COMPND'):
        fields = line.split()
        # Skip blank lines and anything that is not an ATOM record
        if fields and fields[0] == 'ATOM':
            # same as molecule.append(fields[2:])
            key, num, atom_type, x, y, z = fields
            molecule.append([atom_type, x, y, z])
        line = reader.readline()

    return molecule, line


def read_all_molecules(reader: TextIO) -> list:
    '''Read zero or more molecules from reader and return them as a list.
    '''

    # List of molecules
    results = []

    line = reader.readline()
    while line:
        molecule, line = read_molecule(reader, line)
        results.append(molecule)

    return results


if __name__ == "__main__":
    with open('pdb_data.txt', 'r') as file:
        molecules = read_all_molecules(file)
        print(molecules)

Result
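The version above works by reading one line ahead and handing it back to the caller. Another way to cope with the missing END, shown here only as an alternative sketch and not as the approach used in this post, is a single pass that starts a new molecule whenever a COMPND line appears. The function name read_all_molecules_single_pass and the shortened sample data are made up for this example.

```python
from io import StringIO
from typing import List, TextIO


def read_all_molecules_single_pass(reader: TextIO) -> List[list]:
    """Group ATOM records under the most recent COMPND line."""
    results: List[list] = []
    for line in reader:
        fields = line.split()
        if not fields:
            continue                        # ignore blank lines
        if fields[0] == 'COMPND':
            results.append([fields[1]])     # start a new molecule
        elif fields[0] == 'ATOM' and results:
            results[-1].append(fields[2:])  # atom type, x, y, z
    return results


# Shortened sample in the END-less layout of pdb_data.txt.
sample = StringIO(
    "COMPND AMMONIA\n"
    "ATOM 1 N 0.257 -0.363 0.000\n"
    "ATOM 2 H 0.257 0.727 0.000\n"
    "COMPND METHANOL\n"
    "ATOM 1 C -0.748 -0.015 0.024\n"
)

molecules = read_all_molecules_single_pass(sample)
print(molecules)
```

The trade-off is that there is no longer a separate read_molecule function to reuse, but no line has to be passed back and forth between functions either.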
